Why WWW::Mechanize::PhantomJS?
What is WWW::Mechanize::PhantomJS?
Development of WWW::Mechanize::PhantomJS
Applications
Max Maischein
DZ BANK Frankfurt
Deutsche Zentralgenossenschaftsbank
Information management
If I can do it manually
... the computer can repeat it
... correctly every time
Perl (obviously)
Host-Automation (3270, Win32::OLE)
WWW::Mechanize
WWW::Mechanize::Shell (GPW 2002)
WWW::Mechanize::Firefox (2010)
... and now WWW::Mechanize::PhantomJS
Web applications are still on the rise
Applications hold state in the client
Applications rely heavily on Javascript
Javascript is not Perl's strong suit
JavaScript::SpiderMonkey by Mike Schilli and Thomas Busch on CPAN
Only Javascript, no DOM
JE (Javascript engine) by Father Chrysostomos (SPROUT) on CPAN
Pure Perl, slooow
Recognized platform
Compatible platform
Interactive Platform
WWW::Mechanize::Firefox
WWW::Mechanize::Firefox wants a UI window
WWW::Mechanize::Firefox wants to use my browser
PhantomJS is WebKit, but without a UI
PhantomJS
ghostdriver
Selenium::Remote::Driver
WWW::Mechanize::PhantomJS
My program
an extended Interface
of WWW::Mechanize
using PhantomJS as Backend
my $mech = WWW::Mechanize::PhantomJS->new();
$mech->get('http://act.yapc.eu/ye2014');
$mech->content_as_png();
Normal WWW::Mechanize API
Javascript
CSS selectors (via HTML::Selector::XPath)
XPath selectors
Javascript error messages!
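The selector and error-reporting features listed above can be sketched roughly as follows. This is a minimal sketch, not code from the talk: the helper name hrefs_by_css_and_xpath, the a.download class and the URL taken from the command line are assumptions made for illustration. It relies on the module's selector(), xpath() and js_errors() methods and only starts PhantomJS when a URL is passed.

```perl
use strict;
use warnings;

# Hypothetical helper: collect link targets once per selector flavour so
# the two result lists can be compared. $mech is any object offering
# WWW::Mechanize::PhantomJS-style selector() and xpath() methods.
sub hrefs_by_css_and_xpath {
    my ($mech) = @_;

    # CSS selector (translated to XPath behind the scenes)
    my @css = map { $_->get_attribute('href') }
              $mech->selector('a.download');

    # Plain XPath; this simple query only matches class="download" exactly
    my @xpath = map { $_->get_attribute('href') }
                $mech->xpath('//a[@class="download"]');

    return (\@css, \@xpath);
}

# Only talk to a real browser when a URL is given on the command line.
if (@ARGV) {
    require WWW::Mechanize::PhantomJS;
    my $mech = WWW::Mechanize::PhantomJS->new();
    $mech->get($ARGV[0]);

    my ($css, $xpath) = hrefs_by_css_and_xpath($mech);
    print "CSS:   $_\n" for @$css;
    print "XPath: $_\n" for @$xpath;

    # Javascript errors the page has raised so far
    my @errors = $mech->js_errors();
    printf "%d Javascript error(s)\n", scalar @errors;
}
```

Keeping the lookup in a small sub means the same code can be exercised against a stand-in object when no PhantomJS binary is available.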
Automate web sites
Integrated JS unit tests
Validate user input using Javascript server-side
Crazy things
Control PhantomJS
01-open-local.pl
my $mech = WWW::Mechanize::PhantomJS->new();
$mech->get_local('file.html');
Web site usability test
02-dump-links.pl
my $mech = WWW::Mechanize::PhantomJS->new();
$mech->get_local('link.html');

sleep 5;

print $_->get_attribute('href'),
    "\n\t-> ",
    $_->get_attribute('innerHTML'), "\n"
    for $mech->selector('a.download');
Execute Javascript
03-javascript.pl
// Javascript
["Just","another","Perl","Hacker"].join(" ");
Execute Javascript
03-javascript.pl
# Perl
print $mech->eval_in_page(<<'JS');
["Just","another","Perl","Hacker"].join(" ");
JS
Chat application
Javascript+Perl
Server-Sent Events
Tests
05-screenshot-online.pl
my $mech = WWW::Mechanize::PhantomJS->new();
my $url = 'http://mychat.dyn.datenzoo.de:5000';
print "Loading $url\n";
$mech->get($url);

show_screen;
06-send-chat.pl
$mech->get($url);

sleep 5;
# Set username
$mech->eval_in_page(<<'JS', $name);
(function(name) {
    set_username(name);
})(arguments[0]);
JS
sleep 1;
06-send-chat.pl
# Send chat
$mech->eval_in_page(<<'JS', $msg);
(function(msg) {
    $("#message").val( msg );
    post_chat( document.createEvent('UIEvent') );
})(arguments[0]);
JS
06-send-chat.pl
http://www.youtube.com/v/pir_PJmOz8Q
https://twitter.com/cpan_pevans/status/503239001101586432
http://i.qkme.me/3pvsb6.jpg
07-screenshot-pdf.pl
my $mech = WWW::Mechanize::PhantomJS->new();
my $url = 'http://localhost:5000';
print "Loading $url\n";
$mech->get($url);

$mech->render_content(
    format   => 'pdf',
    filename => 'screen.pdf'
);
PhantomJS
ghostdriver (included with module)
Patches for Ghostdriver to circumvent Selenium restrictions (included)
WWW::Mechanize
Selenium::Remote::Driver
API implementation (->post() , ...)
API extensions
Documentation
->post()
Custom HTTP headers (->agent(), ... )
Easy functions implemented first
Selenium is "User simulation" only
Selenium has no ->post() function
->post() function half-implemented
Did not yet need it
Define an API for
browser windows (open, close, popup)
Frames (bad Selenium support)
Alerts (window.alert())
Downloads
Event API? Callback API?
List of things that happened since the last call?
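The last option, a list of things that happened since the last call, could look roughly like the following. This is a hypothetical sketch only: the EventLog package, its record() and since_last_call() methods and the sample events are invented for illustration and are not part of WWW::Mechanize::PhantomJS.

```perl
use strict;
use warnings;

# Hypothetical event buffer; not part of WWW::Mechanize::PhantomJS.
package EventLog;

sub new {
    my ($class) = @_;
    return bless { events => [] }, $class;
}

# Called by the (imagined) browser glue whenever something happens,
# for example an alert, a popup or a finished download.
sub record {
    my ($self, $type, %info) = @_;
    push @{ $self->{events} }, { type => $type, %info };
    return;
}

# Return everything recorded since the previous call and reset the buffer.
sub since_last_call {
    my ($self) = @_;
    my @pending = @{ $self->{events} };
    $self->{events} = [];
    return @pending;
}

package main;

my $log = EventLog->new;
$log->record(alert    => (message  => 'Session expired'));
$log->record(download => (filename => 'report.pdf'));

# The caller polls after each navigation step instead of registering callbacks.
for my $event ($log->since_last_call) {
    print "$event->{type}\n";
}
```

Compared with a callback API, a polled list keeps the calling script strictly sequential, at the cost of only noticing events when it asks for them.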
Documentation for the module API
WWW::Mechanize::PhantomJS
Documentation to answer questions
WWW::Mechanize::PhantomJS::Examples
WWW::Mechanize::PhantomJS::Troubleshooting
Adapt ::Firefox documentation
WWW::Mechanize::PhantomJS::Examples
WWW::Mechanize::PhantomJS::Troubleshooting
WWW::Mechanize::PhantomJS::Installation
(A)synchronous event model
Asynchronous communication (AnyEvent)
Less Selenium
Less mandatory configuration (ports, ...)
                      PhantomJS   Firefox
Display               No          Yes
Persistent cookies    No          Yes
Custom certificates   Easy        Hard
Dialogs               Possible    Hard
alert()               Possible    Hard
The Good
Existing test suite of WWW::Mechanize::Firefox
Existing API of WWW::Mechanize
Experience with ::Firefox
32bit App, 64bit Perl -> TCP!
The Good, the Bad
Selenium is ONLY for browser "interaction"
Selenium doesn't like frames
Hacks for ghostdriver-API
No communication with ghostdriver developers
The Good, the Bad, the Ugly
API coverage through tests
Subtle differences between ::Firefox and ::PhantomJS
100% pass until
s/::Firefox/::PhantomJS/g
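As a rough illustration of that substitution, the sketch below applies it to the source of a small test script. The helper name port_to_phantomjs and the sample script are assumptions for illustration; the real test suite is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite a ::Firefox script so that it targets
# ::PhantomJS instead, using the substitution from the slide.
sub port_to_phantomjs {
    my ($source) = @_;
    (my $ported = $source) =~ s/::Firefox/::PhantomJS/g;
    return $ported;
}

# Made-up two-line script standing in for an existing ::Firefox test.
my $original = <<'PERL';
use WWW::Mechanize::Firefox;
my $mech = WWW::Mechanize::Firefox->new();
PERL

print port_to_phantomjs($original);
```

The rename itself is mechanical; the subtle behavioural differences between the two backends only show up once the renamed tests are run.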
All sample code will be on CPAN as
WWW::Mechanize::PhantomJS::Examples
Questions?
Slides available at
WWW::Mechanize::PhantomJS on CPAN
... tbd ...