WWW::Mechanize::PhantomJS

Max Maischein

Frankfurt.pm

Overview

  • Why WWW::Mechanize::PhantomJS?

  • What is WWW::Mechanize::PhantomJS?

  • Development of WWW::Mechanize::PhantomJS

  • Applications

Who am I

  • Max Maischein

  • DZ BANK Frankfurt

  • Deutsche Zentralgenossenschaftsbank

  • Information management

Automation - My Leitmotiv

  • If I can do it manually

  • ... the computer can repeat it

  • ... correctly every time

My tools

  • Perl (obviously)

  • Host automation (3270, Win32::OLE)

  • WWW::Mechanize

  • WWW::Mechanize::Shell (GPW 2002)

  • WWW::Mechanize::Firefox (2010)

  • ... and now WWW::Mechanize::PhantomJS

Web 2.0

  • Web applications are still on the rise

  • Applications hold state in the client

  • Applications rely heavily on Javascript

  • Javascript is not Perl's strong suit

Javascript (I)

  • JavaScript::SpiderMonkey by Mike Schilli and Thomas Busch, on CPAN

  • Only Javascript, no DOM

  • JE by Father Chrysostomos/SPROUT on CPAN

  • Pure Perl, slooow

Javascript (II)

  • Recognized platform

  • Compatible platform

  • Interactive platform

  • WWW::Mechanize::Firefox

Interactivity is not always great

  • WWW::Mechanize::Firefox wants a UI window

  • WWW::Mechanize::Firefox wants to use my browser

  • PhantomJS is WebKit, but without a UI

Control

  • PhantomJS

  • ghostdriver

  • Selenium::Remote::Driver

  • WWW::Mechanize::PhantomJS

  • My program
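
The layers above can be exercised one level below WWW::Mechanize::PhantomJS. The following is a minimal sketch, not taken from the talk, that talks to ghostdriver directly through Selenium::Remote::Driver. It assumes PhantomJS was started by hand with `phantomjs --webdriver=8910`. The helper name `page_title_via_ghostdriver`, the port and the `host`/`port` options are illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# Assumption: PhantomJS was started by hand with
#     phantomjs --webdriver=8910
# so that ghostdriver listens on localhost:8910.
sub page_title_via_ghostdriver {
    my ($url, %opt) = @_;
    require Selenium::Remote::Driver;    # loaded lazily, only when called
    my $driver = Selenium::Remote::Driver->new(
        remote_server_addr => $opt{host} // 'localhost',
        port               => $opt{port} // 8910,
        browser_name       => 'phantomjs',
    );
    $driver->get($url);                  # navigate the headless browser
    my $title = $driver->get_title;      # read the page title via WebDriver
    $driver->quit;                       # end the WebDriver session
    return $title;
}

# Usage (needs a running PhantomJS, see the assumption above):
#   print page_title_via_ghostdriver('http://act.yapc.eu/ye2014'), "\n";
```

This sketch skips the top layer of the stack and offers only what Selenium exposes, which is the gap the WWW::Mechanize API on top is meant to fill.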

What is WWW::Mechanize::PhantomJS

  • an extended interface

  • of WWW::Mechanize

  • using PhantomJS as the backend

WWW::Mechanize::PhantomJS

    use WWW::Mechanize::PhantomJS;

    my $mech = WWW::Mechanize::PhantomJS->new();
    $mech->get('http://act.yapc.eu/ye2014');
    my $png = $mech->content_as_png();    # PNG data of the rendered page

Features

  • Normal WWW::Mechanize API

  • Javascript

  • CSS selectors (via HTML::Selector::XPath)

  • XPath selectors

  • Javascript error messages!
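
The CSS selector support listed above goes through HTML::Selector::XPath, which rewrites a CSS selector into an XPath expression. A small sketch of that translation step on its own, using selectors that appear elsewhere in these slides as arbitrary examples:

```perl
use strict;
use warnings;
use HTML::Selector::XPath qw(selector_to_xpath);

# Print the XPath expression that each CSS selector is translated into.
for my $css ('a.download', '#message') {
    printf "%-12s => %s\n", $css, selector_to_xpath($css);
}
```

This needs only HTML::Selector::XPath, so it runs without PhantomJS.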

What can I use WWW::Mechanize::PhantomJS for?

  • Automate web sites

  • Integrated JS unit tests

  • Validate user input using Javascript server-side

  • Crazy things

Live demo

Control PhantomJS

01-open-local.pl

    my $mech = WWW::Mechanize::PhantomJS->new();
    $mech->get_local('file.html');

Live demo

Web site usability test

02-dump-links.pl

    my $mech = WWW::Mechanize::PhantomJS->new();
    $mech->get_local('link.html');

    sleep 5;    # wait for the page to settle

    # Print each download link's target and its inner HTML
    print $_->get_attribute('href'),
          "\n\t-> ",
          $_->get_attribute('innerHTML'), "\n"
      for $mech->selector('a.download');
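
The fixed `sleep 5` above either waits too long or not long enough. One possible alternative, not from the talk, is to poll for a condition with a timeout. The helper `wait_until` and its `timeout` and `interval` options are hypothetical and not part of WWW::Mechanize::PhantomJS.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical helper: call $check repeatedly until it returns a true
# value or until $opt{timeout} seconds have passed. Exceptions thrown by
# $check are treated as "not ready yet".
sub wait_until {
    my ($check, %opt) = @_;
    my $timeout  = $opt{timeout}  // 5;
    my $interval = $opt{interval} // 0.1;
    my $deadline = time() + $timeout;
    while (1) {
        my $result = eval { $check->() };
        return $result if $result;
        return if time() >= $deadline;   # timed out: return nothing
        sleep $interval;
    }
}

# Usage, assuming $mech is a WWW::Mechanize::PhantomJS object:
#   my $links = wait_until(sub {
#       my @found = $mech->selector('a.download');
#       @found ? \@found : undef;
#   }, timeout => 10);
```

On success, the helper returns whatever true value the callback produced, so the caller can pass back the matched elements directly.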

Live demo

Execute Javascript

03-javascript.pl

    # Perl: evaluate a Javascript expression in the page and print the result
    print $mech->eval_in_page(<<'JS');
        ["Just","another","Perl","Hacker"].join(" ");
    JS

Screenshots for documentation/logging

  • Chat application

  • Javascript+Perl

  • Server-Sent Events

  • Tests

Screenshots for documentation/logging

05-screenshot-online.pl

    my $mech = WWW::Mechanize::PhantomJS->new();
    my $url = 'http://mychat.dyn.datenzoo.de:5000';
    print "Loading $url\n";
    $mech->get($url);

    show_screen;
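
The slide does not show what `show_screen` does. As a separate sketch for the logging use case, the PNG data returned by `content_as_png()` can be written to a file. The helper name `save_screenshot` and the file naming are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Mechanize::PhantomJS.
# $mech is any object with a content_as_png() method that returns PNG data.
sub save_screenshot {
    my ($mech, $filename) = @_;
    my $png = $mech->content_as_png();
    open my $fh, '>:raw', $filename      # binary mode, PNG is not text
        or die "Cannot write '$filename': $!";
    print {$fh} $png;
    close $fh
        or die "Cannot close '$filename': $!";
    return $filename;
}

# Usage, assuming $mech is a WWW::Mechanize::PhantomJS object:
#   save_screenshot($mech, sprintf 'chat-%d.png', time);
```

Because the helper only relies on `content_as_png()`, it can be tried out with a stand-in object before pointing it at a real browser.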

End-to-end Test of JS app

06-send-chat.pl

    $mech->get($url);

    sleep 5;
    # Set username
    $mech->eval_in_page(<<'JS', $name);
        (function(name) {
            set_username(name);
        })(arguments[0]);
    JS
    sleep 1;

End-to-end Test of JS app

06-send-chat.pl

    # Send chat
    $mech->eval_in_page(<<'JS', $msg);
        (function(msg) {
            $("#message").val( msg );
            post_chat( document.createEvent('UIEvent') );
        })(arguments[0]);
    JS

End-to-end Test of JS app

06-send-chat.pl

  • http://www.youtube.com/v/pir_PJmOz8Q

  • https://twitter.com/cpan_pevans/status/503239001101586432

  • http://i.qkme.me/3pvsb6.jpg

Convert HTML to PDF

07-screenshot-pdf.pl

    my $mech = WWW::Mechanize::PhantomJS->new();
    my $url = 'http://localhost:5000';
    print "Loading $url\n";
    $mech->get($url);

    $mech->render_content(
        format   => 'pdf',
        filename => 'screen.pdf'
    );

Prerequisites for WWW::Mechanize::PhantomJS?

  • PhantomJS

  • ghostdriver (included with module)

  • Patches for ghostdriver to circumvent Selenium restrictions (included)

  • WWW::Mechanize

  • Selenium::Remote::Driver

What is WWW::Mechanize::PhantomJS missing?

  • API implementation (->post() , ...)

  • API extensions

  • Documentation

Missing API implementation

  • ->post()

  • Custom HTTP headers (->agent(), ... )

Need-driven development

  • Easy functions implemented first

  • Selenium is "User simulation" only

  • Selenium has no ->post() function

  • ->post() function half-implemented

  • Did not yet need it

Missing API extensions

Define an API for

  • browser windows (open, close, popup)

  • Frames (bad Selenium support)

  • Alerts (window.alert())

  • Downloads

  • Event API? Callback API?

  • List of things that happened since the last call?

Missing documentation

  • Documentation for the module API

  • WWW::Mechanize::PhantomJS

  • Documentation to answer questions

  • WWW::Mechanize::PhantomJS::Examples

  • WWW::Mechanize::PhantomJS::Troubleshooting

Missing documentation

  • Adapt ::Firefox documentation

  • WWW::Mechanize::PhantomJS::Examples

  • WWW::Mechanize::PhantomJS::Troubleshooting

  • WWW::Mechanize::PhantomJS::Installation

What is WWW::Mechanize::PhantomJS missing?

  • (A)synchronous event model

  • Asynchronous communication (AnyEvent)

  • Less Selenium

  • Less mandatory configuration (ports, ...)

Comparing ::PhantomJS with ::Firefox

    Feature               PhantomJS   Firefox

    Display               No          Yes
    Persistent cookies    No          Yes
    Custom certificates   Easy        Hard
    Dialogs               Possible    Hard
    alert()               Possible    Hard

A look back on the development of WWW::Mechanize::PhantomJS

The Good

  • Existing test suite of WWW::Mechanize::Firefox

  • Existing API of WWW::Mechanize

  • Experience with ::Firefox

  • 32bit App, 64bit Perl -> TCP!

A look back on the development of WWW::Mechanize::PhantomJS

The Good, the Bad

  • Selenium is ONLY for browser "interaction"

  • Selenium doesn't like frames

  • Hacks for ghostdriver-API

  • No communication with ghostdriver developers

A look back on the development of WWW::Mechanize::PhantomJS

The Good, the Bad, the Ugly

  • API coverage through tests

  • Subtle differences between ::Firefox and ::PhantomJS

  • 100% pass until

     s/::Firefox/::PhantomJS/g

Sample code

All sample code will be on CPAN as

WWW::Mechanize::PhantomJS::Examples

Thanks

Questions?

Slides available at

http://corion.net/talks/

WWW::Mechanize::PhantomJS on CPAN

https://github.com/corion/www-mechanize-phantomjs on Github

Bonus Section

World's Worst Browser?

... tbd ...
