Office Automation with Perl

Max Maischein

Frankfurt.pm

Office Automation

  • Microsoft Office

  • An example using Excel

  • A short look at OpenOffice

  • In principle, these ideas are also applicable to OOo

Microsoft Excel

  • Perl is the Swiss army chainsaw of programming languages

  • Excel is the Swiss army chainsaw in all offices

Excel is everywhere

Every company uses Excel:

  • For bookkeeping

  • For data entry

  • For planning of holidays

  • Charts! Charts! Charts!

Perl and Excel

Perl works well together with Excel:

  • print join("\t", @columns), "\n";

  • DBD::CSV

  • SpreadSheet::ParseExcel

  • SpreadSheet::WriteExcel

Perl and Excel (2)

Their interaction could be better still, especially for:

  • Restructuring of tables

  • Merging of data

  • Printing of files

  • Composing of graphs

Perl and Excel (3)

Wouldn't it be nice, if Perl could control Excel?

  • When you can do it by hand

  • then surely the machine can do it better

OLE, Win32::OLE and the MS Office object model

  • Microsoft has OLE

  • Excel can completely be controlled through OLE

  • Perl has Win32::OLE

The forest for the loud trees

There's only one small problem:

  • The documentation for MS Office is huge

  • The online help is almost just as huge

  • It's faster to do something by hand, than to look it up in the documentation

Play it again, Sam

The Office Macro Recorder

  • records actions

  • plays them back again

  • and shows us the Visual Basic code

Assignment

We get a new Excel file every day.

  1. Insert row of headers

  2. The data must be forwarded to a distributor

Analysis

Breakdown into steps:

  • Doing it by hand takes at most 5 minutes

  • Every day

  • It is boring

  • What if you're sick? Or on holiday?

Get a new plan, Stan?

  • Turn on macro recording (ALT-F8)

  • ... operate Excel ...

  • Stop macro recording

  • Open macro in VB-Editor (ALT-F11)

  • Look at module1

Step 1

Step 2

Step 3

Step 4

Step 4

Step 4

Step 5

First result:

 1:    'Option Explicit 1
 2:    Rows("1:1").Select
 3:    Selection.Insert Shift:=xlDown
 4:    ActiveCell.FormulaR1C1 = "Sprache"
 5:    Rows("1:1").Select
 6:    Range("B1").Activate
 7:    ActiveCell.FormulaR1C1 = "Problem"
 8:    Rows("1:1").Select
 9:    Range("C1").Activate
10:    ActiveCell.FormulaR1C1 = "Zeilen"
11:    Range("A1").Select
12:    ActiveWorksheet.SaveAs "foo.xls"

Diversion: Excel as seen from Perl

Every Excel file is a "Book" with a lot of pages.

  • Excel can have more than one file open

  • A file is called a "Workbook"

  • ...and it contains several Worksheets

  • A Worksheet is a page with a table (for example Table 1)

Diversion: Excel as seen from Perl (2)

The list Workbooks contains all open files.

Diversion: Excel as seen from Perl (3)

Of all open files, the active file in particular is (also) accessible through the Excel property ActiveWorkbook.

Diversion: Excel as seen from Perl (4)

Within the active Workbook there also is the active worksheet, which is directly accessible though ActiveWorksheet.

Diversion: Excel as seen from Perl (5)

The currently active input cell is accessible through ActiveCell.

First result

 1:    'Option Explicit 1
 2:    Rows("1:1").Select
 3:    Selection.Insert Shift:=xlDown
 4:    ActiveCell.FormulaR1C1 = "Sprache"
 5:    Rows("1:1").Select
 6:    Range("B1").Activate
 7:    ActiveCell.FormulaR1C1 = "Problem"
 8:    Rows("1:1").Select
 9:    Range("C1").Activate
10:    ActiveCell.FormulaR1C1 = "Zeilen"
11:    Range("A1").Select
12:    ActiveWorksheet.SaveAs "foo.xls"

Diversion: syntax of Visual Basic

Short overview of the syntax:

Comments:

 1:    ' Dies ist ein Kommentar

Line continuation:

 1:    Dies ist ein Befehl, _
 2:    der über zwei Zeilen geht

Diversion: syntax of Visual Basic

Functions:

 1:    X = Left("Hello World",5)

Procedures:

 1:    Print "Hello World"

Watch out!

 1:    Print ("Hello World",1) ' Error
 2:    Print "Hello World",1   ' "Hello World1"

Object methods

Object methods:

 1:    ActiveWorkbook.SaveAs ...
 2:
 3:    $excel->ActiveWorkbook->SaveAs(...)

Diversion: Optional / named parameters

Missing/optional parameters:

 1:    .SaveAs "myfile.csv",,,xlFormatCSV
 2:    ' NO Perl equivalent
 1:    ->SaveAs("myfile.csv",,,xlFormatCSV)
 2:    # Is something else!!

Named parameters:

 1:    .SaveAs Filename := "myfile.csv" _
 2:        Format := xlFormatCSV
 1:    $wb->SaveAs( { Filename => "myfile.csv",
 2:        Format => xlFormatCSV })

Converting VB to Perl

  • Dot to ->

  • Dim Foo As Bar to my $foo;

  • Set Plonk = X to $plonk = $x

  • Active... to my $foo =

  • $Excel-> in front of all "global" calls:

     1:    Workbooks
     2:    # must be turned into
     3:    $Excel->Workbooks

Example 1: Saving a file

VB:

 1:  ActiveWorksheet.SaveAs "foo.xls"

Direct conversion:

 1:  my $Excel = Win32::OLE->new('Excel.Application');
 2:  my $Worksheet = $Excel->ActiveWorksheet;
 3:  $Worksheet->SaveAs("foo.xls");

Example 1: Mass export to CSV

VB:

 1:  ActiveWorksheet.SaveAs "foo.xls"

Improved conversion:

 1:  use Win32::OLE qw(in);
 2:  Win32::OLE->Warn(3); # croak() on OLE error
 3:
 4:  my $Excel = Win32::OLE->new('Excel.Application');
 5:  my $Worksheets = $Excel->ActiveWorkbook->Worksheets;
 6:  for my $Worksheet (in ($Worksheets)) {
 7:      $Worksheet->SaveAs(...);
 8:  };
 9:  $Excel->ActiveWorkbook->Close();

Binding of type libraries and constants

VB:

 1:  ActiveWorksheet.SaveAs "foo.csv", FileFormat := xlFormatCSV

Conversion:

 1:  ...
 2:  ...
 3:  $Worksheet->SaveAs("foo.csv", { FileFormat => xlFormatCSV });

Binding of type libraries and constants

VB:

 1:  ActiveWorksheet.SaveAs "foo.csv", FileFormat := xlFormatCSV

Conversion:

 1:  use Win32::OLE::Const 'Microsoft Excel';
 2:  ...
 3:  $Worksheet->SaveAs("foo.csv", { FileFormat => xlFormatCSV });

Constants from the Internet

Alternatively, you can also quickly look up a constant in a Google Search:

 1:  Google: xlVeryHidden

Example 2: Presentation of CSV data

A (Perl) program processes a lot of Excel files, one for each customer.

The assignment is to convert this data into a presentable form.

  • Load file

  • Enter column titles

  • Produce Pivot-Tables

  • Produce chart from tables

  • Save file as .xls

Step 1

Step 2

Step 3

Step 4

Step 4

Step 4

Step 5 - Pivot Table Wizard

Step 6

Step 7

Step 7

Step 8

Step 9

Step 10

Step 11

Not-so-live demo (Column titles)

 1:    Rows("1:1").Select
 2:    Selection.Insert Shift:=xlDown
 3:    ActiveCell.FormulaR1C1 = "Sprache"
 4:    Rows("1:1").Select
 5:    Range("B1").Activate
 6:    ActiveCell.FormulaR1C1 = "Problem"
 7:    Rows("1:1").Select
 8:    Range("C1").Activate
 9:    ActiveCell.FormulaR1C1 = "Zeilen"
10:    Range("A1").Select

Not-so-live demo (Pivot tables)

 1:  ActiveWorkbook.PivotCaches.Add(SourceType:=xlDatabase, SourceData:= _
 2:      "Tabelle1!R1C1:R10C3").CreatePivotTable TableDestination:="", TableName:= _
 3:      "PivotTable2", DefaultVersion:=xlPivotTableVersion10
 4:  ActiveSheet.PivotTableWizard TableDestination:=ActiveSheet.Cells(3, 1)
 5:  ActiveSheet.Cells(3, 1).Select
 6:  With ActiveSheet.PivotTables("PivotTable2").PivotFields("Sprache")
 7:      .Orientation = xlRowField
 8:      .Position = 1
 9:  End With
10:  With ActiveSheet.PivotTables("PivotTable2").PivotFields("Problem")
11:      .Orientation = xlColumnField
12:      .Position = 1
13:  End With
14:  ActiveSheet.PivotTables("PivotTable2").AddDataField ActiveSheet.PivotTables( _
15:      "PivotTable2").PivotFields("Zeilen"), "Summe von Zeilen", xlSum

Not-so-live demo (Chart)

 1:    Charts.Add
 2:    ActiveChart.SetSourceData Source:=Sheets("Tabelle5").Range("A3")
 3:    ActiveChart.Location Where:=xlLocationAsNewSheet
 4:    ActiveChart.ChartType = xlRadar
 5:    ActiveChart.ChartType = xlAreaStacked

Not-so-live demo (Save)

 1:    ActiveWorkbook.SaveAs Filename:= _
 2:        "C:\Dokumente und Einstellungen\xn09999\Eigene Dateien\Mappe1.xls", FileFormat _
 3:        :=xlNormal, Password:="", WriteResPassword:="", ReadOnlyRecommended:= _
 4:        False, CreateBackup:=False

Perl Programming:

Setup

 1:    my $Excel = Win32::OLE->new('Excel.Application', sub { $_[0]->Quit });
 2:
 3:    # Always use absolute path names,
 4:    # because Excel may have a different current directory
 5:    my $workbook = $Excel->Workbooks->Add($filename);
 6:
 7:    # Traditional names
 8:    my $sheet = $workbook->Worksheets("Tabelle1");

Column titles (Perl)

 1:    # Insert rows
 2:    # Rows("1:1").Select
 3:    # Selection.Insert Shift:=xlDown
 4:    $sheet->Rows("1:1")->Select->Insert( Shift => xlDown );
 5:
 6:    # ActiveCell.FormulaR1C1 = "Sprache"
 7:    # Rows("1:1").Select
 8:    $sheet->Cells("A1")->FormulaR1C1 = "Sprache";
 9:
10:    # Range("B1").Activate
11:    # ActiveCell.FormulaR1C1 = "Problem"
12:    $sheet->Cells("B1")->FormulaR1C1 = "Problem";
13:
14:    # Rows("1:1").Select
15:    # Range("C1").Activate
16:    # ActiveCell.FormulaR1C1 = "Zeilen"
17:    $sheet->Cells("C1")->FormulaR1C1 = "Zeilen";
18:    # Range("A1").Select

Slow-motion (2)

Pivot tables (Cache)

 1:    # ActiveWorkbook.PivotCaches.Add(SourceType:=xlDatabase, SourceData:= _
 2:    #    "Tabelle1!R1C1:R10C3").CreatePivotTable TableDestination:="", _
 3:    #    TableName:="PivotTable2", DefaultVersion:=xlPivotTableVersion10
 4:    my $cache = $workbook->PivotCaches->Add({
 5:        SourceType => xlDatabase, SourceData => 'Tabelle1!R1C1:R10C3'
 6:    });

Select tables

 1:    $cache->CreatePivotTable( { TableDestination => "",
 2:        TableName => "PivotTable2",
 3:        DefaultVersion => xlPivotTableVersion10
 4:    });

Slow-motion (3)

 1:    #ActiveSheet.PivotTableWizard TableDestination:=ActiveSheet.Cells(3, 1)
 2:    $sheet->PivotTableWizard(...);
 3:
 4:    #ActiveSheet.Cells(3, 1).Select
 5:    #With ActiveSheet.PivotTables("PivotTable2").PivotFields("Sprache")
 6:    #    .Orientation = xlRowField
 7:    #    .Position = 1
 8:    #End With
 9:    my $table = $sheet->PivotTables("PivotTable2");
10:    for ($table->PivotFields("Sprache")) {
11:        $_->Orientation = xlRowField;
12:        $_->Position = 1;
13:    }
14:
15:    #With ActiveSheet.PivotTables("PivotTable2").PivotFields("Problem")
16:    #    .Orientation = xlColumnField
17:    #    .Position = 1
18:    #End With
19:    for ($table->PivotFields("Problem")) {
20:        $_->Orientation = xlColumnField;
21:        $_->Position = 1;
22:    }
23:
24:    #ActiveSheet.PivotTables("PivotTable2").AddDataField ActiveSheet.PivotTables( _
25:    #    "PivotTable2").PivotFields("Zeilen"), "Summe von Zeilen", xlSum
26:    $table->AddDataField( $table->PivotFields("Zeilen"),
27:        "Summe von Zeilen", xlSum );

Slow-motion replay (4)

Chart

 1:    # Charts.Add
 2:    my $chart = $workbook->Charts->Add();
 3:    #ActiveChart.SetSourceData Source:=Sheets("Tabelle5").Range("A3")
 4:    #ActiveChart.Location Where:=xlLocationAsNewSheet
 5:    #ActiveChart.ChartType = xlAreaStacked
 6:    #ActiveChart.ChartType = xlRadar
 7:    for ($chart) {
 8:        $_->SetSourceData(...);
 9:        $_->Location( {Where => xlLocationAsNewSheet});
10:        $_->ChartType( xlRadar );
11:    }

Not-so-live demo (Save)

Save

 1:    #ActiveWorkbook.SaveAs Filename:= _
 2:    #    "C:\Dokumente und Einstellungen\xn09999\Eigene Dateien\Mappe1.xls", _
 3:    #    FileFormat:=xlNormal, Password:="", WriteResPassword:="", _
 4:    #    ReadOnlyRecommended:=False, CreateBackup:=False
 5:    $workbook->SaveAs(...);

Where else?

  • What goes for Excel, also goes for almost all other Office Products

  • Word, PowerPoint

  • Corel Draw with Corel Basic

Where not?

  • IE has an OLE interface, but no macro recorder

  • Outlook (very limited model)

  • Mozilla/FireFox

  • OpenOffice 2

Thoughtful design

Export a useful external API, and use it themselves:

  • Microsoft Office

  • eBay

  • Saves other developers a lot of frustration

OpenOffice has an external API, but it is utterly unusable.

Thank you

Thank you

Bonus Level

Bonus Level

Perl in Excel

Perl features are now also available in Excel:

The VLOOKUP function is for storing data in a table. Also a hash lookup from Perl.

 1:  =VLOOKUP(A1;B:C;2;FALSE)

is the same as:

 1:  my %b_c = (
 2:      Perl => 'http://www.perl.org',
 3:      PHP  => 'http://hardenedphp.de',
 4:      Ruby => 'http://ruby-lang.org',
 5:  );
 6:  print $b_c{ $a1 };

Oneliners with Excel files, even without MS Office

xlsperl / xlsgrep by JJ ("xlstools"):

 1:  xlsgrep "Hello" datei1.xls
 1:  xlsperl -le "s/Excelworkshop/Perlworkshop/" datei1.xls

Oneliners with Excel files, with MS Office

ExcelPerl Martin Fabiani:

 1:  excelperl -le "s/Excelworkshop/Perlworkshop/" datei1.xls

http://www.perlmonks.org/?node=excelperl

Diversion: more Visual Basic syntax

Line continuation:

 1:    Print _
 2:      "Hello World"

Compare to shell syntax:

 1:    echo \
 2:        "Hello World"

Diversion: VB string manipulation

Concatenation:

 1:    Print "Hello " & "World"

String escapes:

None!

 1:    Print "Er sagte " & Chr(34) & "Hello World" & Chr(34) & vbCrLf

The mirror universe

In Soviet Russia, Excel automates Perl

Data import into OpenOffice (2)

HTTP::Server::Simple together with external data linkage in OpenOffice 2 (+live demo?!)

  • Start server:

     1:  perl -MHTTP::Server::Simple -e "HTTP::Server::Simple->new->run"
  • Import data to OOo:

    Insert -> Linkage to external data -> URL -> http://localhost:8080/

  • Select tables

  • Done

Data import or export over ODBC

  • Problematic, because it requires Administrator privileges.

  • Doesn't work well across computer boundaries

  • Doesn't quite work across OS boundaries

OpenOffice Basic

OpenOffice Basic is OK, but the object model is terrible, as it is copied directly from Java:

 1:    sub Testmakro
 2:    rem -------------------------------------
 3:    rem define variables
 4:    dim document   as object
 5:    dim dispatcher as object
 6:    rem -------------------------------------
 7:    rem get access to the document
 8:    document   = ThisComponent.CurrentController.Frame
 9:    dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")
 1:    rem -------------------------------------
 2:    dim args1(0) as new com.sun.star.beans.PropertyValue
 3:    args1(0).Name = "StringName"
 4:    args1(0).Value = "X"
 5:
 6:    dispatcher.executeDispatch(document, ".uno:EnterString", "", 0, args1())

OpenOffice parameter passing

 1:    dim args2(1) as new com.sun.star.beans.PropertyValue
2:    args2(0).Name = "By"
3:    args2(0).Value = 1
4:    args2(1).Name = "Sel"
5:    args2(1).Value = false
6:
7:    dispatcher.executeDispatch(document, _
8:        ".uno:GoRight", "", 0, args2())

More sensible:

 1:    document.GoRight By := 1, Sel := False

OpenOffice parameter passing (Perlish)

 1:    args2(0).Name = "By"
2:    args2(0).Value = 1
3:    args2(1).Name = "Sel"
4:    args2(1).Value = false
5:
6:    dispatcher.executeDispatch(document, _
7:        ".uno:GoRight", "", 0, args2())

In Perl:

 1:    sub AUTOLOAD {
 2:        my ($obj,@args) = @_;
 3:        my ($method = $AUTOLOAD) =~ s/.*::(\w+)$/$1/g;
 4:        $dispatcher->executeDispatch($obj,".uno:$method",
 5:            "", 0, \@args);
 6:    }
 7:    $document->GoRight( By => 1, Sel => undef );