XSL-FO and FOP
We do the job as a two-step operation:
- We write an XSLT transformation, xml-to-fo.xslt, which produces a file marked up as XSL-FO (formatting objects), olympic.fo.
- We use a standard program, FOP, which among many other things can produce PDF from XSL-FO documents.
The XSLT transformation, xml-to-fo.xslt, looks like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:output method="xml" version="1.0"
encoding="ISO-8859-1" indent="yes"/>
<xsl:template match="/">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<!-- layout for the first page -->
<fo:simple-page-master master-name="coverpage"
page-height="29.7cm"
page-width="21cm"
margin-top="7cm"
margin-bottom="2cm"
margin-left="2.5cm"
margin-right="2.5cm">
<fo:region-body margin-top="3cm" margin-bottom="1.5cm"/>
<fo:region-before extent="2cm"/>
<fo:region-after extent="1.5cm"/>
</fo:simple-page-master>
<!-- layout for all other pages -->
<fo:simple-page-master master-name="pages"
page-height="29.7cm"
page-width="21cm"
margin-top="1cm"
margin-bottom="2cm"
margin-left="2.5cm"
margin-right="2.5cm">
<fo:region-body margin-top="1cm" margin-bottom="1cm"/>
<fo:region-before extent="2cm"/>
<fo:region-after extent="1.5cm"/>
</fo:simple-page-master>
</fo:layout-master-set>
<!-- filling the front page -->
<fo:page-sequence master-reference="coverpage">
<fo:flow flow-name="xsl-region-body">
<fo:block font-weight="bold" font-size="28pt"
line-height="38pt" font-family="Times">
Olympiske resultater
</fo:block>
<fo:block font-weight="normal" font-size="13pt"
line-height="15pt" font-family="Times">
Sprintøvelsene i de siste olympiadene
</fo:block>
</fo:flow>
</fo:page-sequence>
<!-- doing all olympics in turn -->
<xsl:apply-templates select="/IOC/OlympicGame">
<xsl:sort select="@year"/>
</xsl:apply-templates>
</fo:root>
</xsl:template>
<xsl:template match="//OlympicGame">
<fo:page-sequence master-reference="pages">
<fo:flow flow-name="xsl-region-body">
<fo:block font-weight="bold" font-size="18pt"
line-height="28pt" font-family="Times"
padding-top="0cm"
border-bottom-color="black"
border-bottom-style="solid">
<xsl:element name="fo:external-graphic">
<xsl:attribute name="src">
<xsl:value-of select="@place"/>.gif</xsl:attribute>
</xsl:element>
</fo:block>
<xsl:apply-templates select="event">
<xsl:sort select="@dist"/>
</xsl:apply-templates>
</fo:flow>
</fo:page-sequence>
</xsl:template>
<xsl:template match="//event">
<fo:block font-weight="bold" font-size="12pt"
line-height="14pt" font-family="Times"
padding-top="1cm" >
<xsl:value-of select="@dist"/>
</fo:block>
<xsl:apply-templates select="athlet">
<xsl:sort data-type="number" select="result"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="//athlet">
<fo:block font-size="10pt" line-height="14pt"
font-family="Times" >
<xsl:value-of select="name"/>,
<xsl:value-of select="nation"/> :
<xsl:value-of select="result"/>
</fo:block>
</xsl:template>
</xsl:stylesheet>
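The page geometry in the two page masters can be checked with a little arithmetic. The sketch below copies the dimensions from the stylesheet above, converted to millimetres, and computes the size of region-body and how far region-before and region-after reach into it. The dictionary layout and the function names are made up for this illustration; only the numbers come from the stylesheet.

```python
# Page geometry in millimetres, copied from the two fo:simple-page-master
# elements (1 cm = 10 mm, so every value is a whole number).
MASTERS = {
    "coverpage": {
        "page": (210, 297),
        "margin": {"top": 70, "bottom": 20, "left": 25, "right": 25},
        "body": {"top": 30, "bottom": 15},   # margins on fo:region-body
        "before": 20,                        # extent of fo:region-before
        "after": 15,                         # extent of fo:region-after
    },
    "pages": {
        "page": (210, 297),
        "margin": {"top": 10, "bottom": 20, "left": 25, "right": 25},
        "body": {"top": 10, "bottom": 10},
        "before": 20,
        "after": 15,
    },
}


def body_box(master):
    """Return (width, height) of region-body in mm."""
    width, height = master["page"]
    margin, body = master["margin"], master["body"]
    width -= margin["left"] + margin["right"]
    height -= margin["top"] + margin["bottom"] + body["top"] + body["bottom"]
    return width, height


def overlaps(master):
    """Return how many mm region-before and region-after reach into region-body."""
    body = master["body"]
    return (max(0, master["before"] - body["top"]),
            max(0, master["after"] - body["bottom"]))


for name, master in MASTERS.items():
    print(name, body_box(master), overlaps(master))
```

On the master called pages, region-before (2 cm) is taller than the top margin of region-body (1 cm), so the two overlap by 1 cm, and region-after overlaps by 0.5 cm. That is harmless here because the stylesheet puts no static content in those regions, but it would matter if a header or footer were added.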
We assume that FOP is installed in the directory fop, and can then call FOP from the command line like this:
c:\fop\fop.bat olympic.fo olympic.pdf
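The same call can be prepared from a script. The sketch below only builds argument lists; it uses FOP's explicit -fo and -pdf options instead of the positional form above, and also shows the -xml and -xsl options, which let FOP run the XSLT step itself. The function names are made up for this illustration, and the name of the XML source file is left as a parameter because this section does not give it.

```python
# Path to the FOP start script, taken from the command line shown above.
FOP = r"c:\fop\fop.bat"


def fop_fo_to_pdf(fo_file, pdf_file, fop=FOP):
    """Arguments for step two only: XSL-FO file in, PDF out."""
    return [fop, "-fo", fo_file, "-pdf", pdf_file]


def fop_xml_to_pdf(xml_file, xslt_file, pdf_file, fop=FOP):
    """Arguments for both steps in one run: FOP applies the stylesheet itself."""
    return [fop, "-xml", xml_file, "-xsl", xslt_file, "-pdf", pdf_file]


print(fop_fo_to_pdf("olympic.fo", "olympic.pdf"))
```

Either list can be handed to subprocess.call to start FOP.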
PRINCE
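We do the following: an XSLT transformation produces an HTML file, and Prince combines that HTML file with a CSS stylesheet to produce the PDF.
The transformation that produces the HTML is essentially the same as the one used in the module XML2HTML: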
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"
version="1.0"
encoding="ISO-8859-1"
indent="yes"
doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
<xsl:template match="/">
<html>
<head>
<title>Olympics</title>
</head>
<body>
<h1>Resultater sprint</h1>
<p>fra de siste olympiske leker</p>
<hr/>
<xsl:apply-templates select="IOC/OlympicGame">
<xsl:sort select="@year" order="ascending"/>
</xsl:apply-templates>
</body>
</html>
</xsl:template>
<xsl:template match="OlympicGame">
<table cellpadding="10">
<tr>
<td>
<xsl:element name="img">
<xsl:attribute name="src">
<xsl:value-of select="@place"/>.gif</xsl:attribute>
<xsl:attribute name="alt">
<xsl:value-of select="@place"/></xsl:attribute>
</xsl:element>
</td>
<td>
<h1><xsl:value-of select="@place"/> <br/>
<xsl:value-of select="@year"/></h1>
</td>
</tr>
</table>
<table cellpadding="10" border="0" cellspacing="0">
<tr>
<xsl:apply-templates select="event"/>
</tr>
</table>
</xsl:template>
<xsl:template match="//event">
<td valign="top">
<h2><xsl:value-of select="@dist"/></h2>
<xsl:apply-templates select="athlet">
<xsl:sort data-type="number" select="result"/>
</xsl:apply-templates>
</td>
</xsl:template>
<xsl:template match="athlet">
<p><xsl:value-of select="name"/><br/>
<xsl:value-of select="nation"/><br/>
<xsl:value-of select="result"/></p>
</xsl:template>
</xsl:stylesheet>
The result is the file prepared.html.
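The stylesheet above sorts the games by year and the athletes within each event by result, with data-type="number" on the result. The following sketch mirrors that ordering in plain Python to show why the numeric data type matters. The element and attribute names are the ones the stylesheet selects; the sample data (names, nations, results) is invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample data with the structure the stylesheet expects.
SAMPLE = """<IOC>
  <OlympicGame year="2000" place="Sydney">
    <event dist="100m">
      <athlet><name>Runner B</name><nation>BBB</nation><result>9.99</result></athlet>
      <athlet><name>Runner A</name><nation>AAA</nation><result>9.87</result></athlet>
    </event>
  </OlympicGame>
  <OlympicGame year="1996" place="Atlanta">
    <event dist="100m">
      <athlet><name>Runner C</name><nation>CCC</nation><result>10.02</result></athlet>
      <athlet><name>Runner D</name><nation>DDD</nation><result>9.9</result></athlet>
    </event>
  </OlympicGame>
</IOC>"""


def ordered_results(xml_text):
    """Return (year, dist, name, result) rows in the order the stylesheet emits them."""
    root = ET.fromstring(xml_text)
    rows = []
    # xsl:sort select="@year": games in ascending year order
    for game in sorted(root.findall("OlympicGame"), key=lambda g: g.get("year")):
        for event in game.findall("event"):
            # xsl:sort data-type="number" select="result": compare as numbers
            athletes = sorted(event.findall("athlet"),
                              key=lambda a: float(a.findtext("result")))
            for athlete in athletes:
                rows.append((game.get("year"), event.get("dist"),
                             athlete.findtext("name"), athlete.findtext("result")))
    return rows


for row in ordered_results(SAMPLE):
    print(row)
```

Compared as text, "10.02" would sort before "9.9"; compared as numbers, 9.9 comes first, which is the order wanted for sprint times.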
The CSS file that describes the layout of the PDF file, printpages.css, is very simple:
@page { size: A4;
margin: 100pt 40pt 40pt 90pt;
@top-left {
content:"demo";
}
@top-right {
content:"Markup og Web";
font-size:24px;
}
@bottom-right {
content: counter(page);
font-style: italic;
font-size:11px;
border-top-style:solid;
border-top-width:thin;
}
@bottom-left {
content:"B. Stenseth";
font-style: italic;
font-size:11px;
border-top-style:solid;
border-top-width:thin;
}
}
h1{page-break-before:always}
The result is the file prepared.pdf.
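The text does not show how Prince is started for this case. A minimal sketch, assuming the Prince executable can be found on the path under the name prince, builds the command from the files named above; -o names the output file and -s adds a stylesheet. The function names are made up for this illustration.

```python
import subprocess


def prince_command(html_file, pdf_file, stylesheets=(), prince="prince"):
    """Build the argument list for one Prince run."""
    command = [prince, html_file, "-o", pdf_file]
    for css in stylesheets:
        # One -s option per stylesheet
        command += ["-s", css]
    return command


def make_pdf(html_file, pdf_file, stylesheets=()):
    """Run Prince and report whether it exited without error."""
    return subprocess.call(prince_command(html_file, pdf_file, stylesheets)) == 0


print(prince_command("prepared.html", "prepared.pdf", ["printpages.css"]))
```

Calling make_pdf("prepared.html", "prepared.pdf", ["printpages.css"]) would then produce the PDF, provided Prince is installed.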
Python, lxml and Prince
We will carry out a composite transformation: the XML is first transformed to HTML, and Prince then turns the HTML into PDF.
We start by writing a simple transformation, tohtml.xsl, which produces an HTML file. What mainly distinguishes this transformation from the one in the previous section is that this time we generate a table of contents.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"
version="1.0" omit-xml-declaration="yes"
encoding="UTF-8"
indent="no"/>
<xsl:template match="/">
<xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE html&gt;
</xsl:text>
<html>
<head>
<title>Olympiade</title>
<link href="olscreen.css" rel="stylesheet" />
<link href="olprint.css" rel="stylesheet" />
<link href="olprojection.css" rel="stylesheet" />
</head>
<body>
<div id="heading">Olympiske sprintresultater</div>
<xsl:call-template name="toc"/>
<xsl:apply-templates select="IOC/OlympicGame">
<xsl:sort select="@year" order="ascending"/>
</xsl:apply-templates>
</body>
</html>
</xsl:template>
<xsl:template match="OlympicGame">
<xsl:element name="a">
<xsl:attribute name="name"><xsl:value-of select="@place"/></xsl:attribute>
<h1>
<xsl:value-of select="@place"/> - <xsl:value-of select="@year"/>
</h1>
</xsl:element>
<div>
<xsl:element name="img">
<xsl:attribute name="src"><xsl:value-of select="@place"/>.gif</xsl:attribute>
<xsl:attribute name="alt"><xsl:value-of select="@place"/></xsl:attribute>
</xsl:element>
</div>
<xsl:apply-templates select="event"/>
</xsl:template>
<xsl:template match="event">
<h2><xsl:value-of select="@dist"/></h2>
<xsl:apply-templates select="athlet">
<xsl:sort data-type="number" select="result"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="athlet">
<div class="athlet">
<p><xsl:value-of select="name"/></p>
<p><xsl:value-of select="nation"/></p>
<p><xsl:value-of select="result"/></p>
</div>
</xsl:template>
<xsl:template name="toc">
<xsl:element name="div">
<xsl:attribute name="id">maintoc</xsl:attribute>
<div class="tocheader">Innhold</div>
<xsl:for-each select="//OlympicGame">
<xsl:sort select="@year" order="ascending"/>
<div class="toclevel1">
<xsl:element name="a">
<xsl:attribute name="href">#<xsl:value-of select="@place"/></xsl:attribute>
<xsl:value-of select="@place"/>
</xsl:element>
</div>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
The result of the transformation is the file bok.html.
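The table of contents only works if every href="#..." written by the toc template has a matching anchor written by the OlympicGame template. A small check of a generated file can be sketched with the standard library HTML parser. The class and function names, and the sample markup with its deliberately broken third link, are made up for this illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect link targets (name and id attributes) and internal links."""

    def __init__(self):
        super().__init__()
        self.targets = set()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id"):
            self.targets.add(attributes["id"])
        if tag == "a":
            if attributes.get("name"):
                self.targets.add(attributes["name"])
            href = attributes.get("href") or ""
            if href.startswith("#"):
                self.links.append(href[1:])


def unresolved_links(html_text):
    """Return the internal link targets that no anchor in the document defines."""
    collector = AnchorCollector()
    collector.feed(html_text)
    return [link for link in collector.links if link not in collector.targets]


# Invented sample: the third entry points at an anchor that does not exist.
SAMPLE_HTML = """
<div id="maintoc">
  <div class="toclevel1"><a href="#Atlanta">Atlanta</a></div>
  <div class="toclevel1"><a href="#Sydney">Sydney</a></div>
  <div class="toclevel1"><a href="#Athens">Athens</a></div>
</div>
<a name="Atlanta"><h1>Atlanta - 1996</h1></a>
<a name="Sydney"><h1>Sydney - 2000</h1></a>
"""

print(unresolved_links(SAMPLE_HTML))
```

Because both templates build the fragment from the same @place attribute, the check should come back empty for bok.html.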
Then we write some Python modules that carry out the combined operation:
- transformation XML -> HTML
- a call to Prince to produce the PDF
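The transformation itself, transform.py, is done like this:
"""
Transforming XML to HTML using lxml
"""
from lxml import etree


def produce(xmlfile, xsltfile):
    # Parse the source document and the stylesheet
    xmlTree = etree.parse(xmlfile)
    xsltTree = etree.parse(xsltfile)
    # Compile the stylesheet and apply it to the document
    transform = etree.XSLT(xsltTree)
    resultTree = transform(xmlTree)
    # Serialize the result tree to text
    return str(resultTree)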
In the module makesingle.py below, the function doSinglePageJob is the one that invokes Prince.
"""
The purpose of this module is to make a PDF file from an HTML file,
one to one.
The converter engine is PrinceXML.
Parameters to this module when run from the command line
are any number of HTML files.
The PDF files will have the same name, but pdf as extension.
"""
import subprocess
import sys

import utils
import transform

# --------------------
# fixed paths and logging

# catalog
cat = 'c:\\web\\dw\\olymp\\'
# prince path
princepath = 'c:\\fixed\\prince\\engine\\bin\\prince.exe'
# log file
logfile = cat + 'ol2pdfprince2\\princelog.txt'
# print log after job
printlog = False
# full report
verbose = False
# all stylesheets
stylesheets = [cat + 'ol2pdfprince2\\olsheet1.css']


def eraseLog():
    """Erase the log file."""
    utils.storeTextFile(logfile, '')


def doSinglePageJob(infile, outfile):
    """Do one page to one page."""
    print(infile + ' -> ' + outfile)
    # Each option and its value is a separate list item, so that
    # Prince receives them as separate arguments
    params = [princepath, infile, '-o', outfile, '--log=' + logfile]
    if verbose:
        params.append('-v')
    for style in stylesheets:
        params.extend(['-s', style])
    print('making: ' + outfile)
    subprocess.call(params)


# --------------------------------
if __name__ == "__main__":
    T = transform.produce(cat + 'all_results.xml',
                          cat + 'ol2pdfprince2\\tohtml.xsl')
    utils.storeTextFile(cat + 'ol2pdfprince2\\bok.html', T)
    doSinglePageJob(cat + 'ol2pdfprince2\\bok.html',
                    cat + 'ol2pdfprince2\\bok.pdf')
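subprocess.call is given the command as a list, and every list item reaches the started program as exactly one argument. The sketch below makes that visible by starting a small Python child process that reports the arguments it received; the helper name child_argv is made up for this illustration.

```python
import json
import subprocess
import sys


def child_argv(args):
    """Start a child Python process and return the arguments it received."""
    probe = "import json, sys; print(json.dumps(sys.argv[1:]))"
    finished = subprocess.run([sys.executable, "-c", probe] + list(args),
                              capture_output=True, text=True, check=True)
    return json.loads(finished.stdout)


# An option and its value joined in one string arrive as ONE argument ...
print(child_argv(["-o out.pdf"]))
# ... while two list items arrive as two arguments.
print(child_argv(["-o", "out.pdf"]))
```

For a command line tool that expects an option followed by a separate value, the second form is the one that matches what a shell would pass for -o out.pdf.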
The module utils contains just two functions for file access:
def getTextFile(filename):
    """Load a text file; return '' if it cannot be read."""
    try:
        # tohtml.xsl declares UTF-8 output, so read and write UTF-8
        with open(filename, 'r', encoding='utf-8') as file:
            return file.read()
    except OSError:
        print('Error reading file ', filename)
        return ''


def storeTextFile(filename, txt):
    """Store a text file."""
    try:
        with open(filename, 'w', encoding='utf-8') as file:
            file.write(txt)
    except OSError:
        print('Trouble writing to: ' + filename)
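Since tohtml.xsl declares UTF-8 as its output encoding, the file functions should agree with it; when no encoding is given, Python falls back to a platform default that on Windows is often not UTF-8. A round trip through a temporary file is a simple way to check that non-ASCII text such as "Sprintøvelsene" survives. The function names below are made up for this illustration and are not part of the utils module.

```python
import tempfile
from pathlib import Path


def store_text(path, text, encoding="utf-8"):
    """Write text to a file with an explicit encoding."""
    Path(path).write_text(text, encoding=encoding)


def load_text(path, encoding="utf-8"):
    """Read text from a file with an explicit encoding."""
    return Path(path).read_text(encoding=encoding)


def roundtrip(text):
    """Store text in a temporary file and read it back."""
    with tempfile.TemporaryDirectory() as folder:
        target = Path(folder) / "sample.html"
        store_text(target, text)
        return load_text(target)


print(roundtrip("Sprintøvelsene i de siste olympiadene"))
```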
The stylesheet olsheet1.css, which is used for the PDF production, looks like this:
@page { size: A4;
margin: 100pt 40pt 40pt 90pt;
@top-left {
content:url(http://www.ia.hiof.no/~borres/common/gfx/printlogo.gif);
}
@top-right {
content: string(doctitle);
font-size:18px;
}
@bottom-right {
content: counter(page);
font-style: italic;
font-size:11px;
border-top-style:solid;
border-top-width:thin;
}
@bottom-left {
content:"B. Stenseth";
font-style: italic;
font-size:11px;
border-top-style:solid;
border-top-width:thin;
}
}
@page:first {
@top-left {
content:url(http://www.ia.hiof.no/~borres/common/gfx/printlogo_txt.gif);
}
@top-right {
content:"";
font-size:24px;
}
}
#heading{margin-top:50px;margin-bottom:50px;font-weight:bold;font-size:36px}
h1 { string-set: doctitle content() }
h1,h2 {page-break-before:always}
#maintoc a::after { content: leader(".") target-counter(attr(href), page); }
.tocheader{margin-top:50px;margin-bottom:50px;font-weight:bold;font-size:20px}
.toclevel1{margin-left:20px;line-height:150%}
.athlet p{line-height:70%}
.athlet :first-child{font-weight:bold}
/* linking NB: sequence is important */
a:link {color:black;text-decoration:none}
a:visited {color:black;text-decoration:none}
a:hover {color:black;text-decoration:none}
a:active {color:black;text-decoration:none}
The stylesheets below are used for screen, print and projection (F11 in Opera):
@media screen
{
h1{color:red}
}
@media print
{
h1{color:blue;page-break-before:always}
}
@media projection
{
#maintoc,img{display:none}
h1,h2{page-break-before:always}
.athlet {line-height:60%;margin-left:150px;margin-top:40px;}
.athlet :first-child{font-weight:bold;font-size:20px}
#heading{margin-left:150px;margin-top:150px;font-size:46px}
}
The result is the file bok.pdf.