Diaries of a Desperate (XML|XProc) Hacker

James Fuller
Lead Engineer | MarkLogic

Background
• Engineer on MarkLogic API team (History meters, Management API, etc…)
• W3C XML Processing WG (XProc v2.0)
• 2001 started with XML tech (EXSLT), XML Prague, etc…
• Open source contrib.
• Thank you to the organisers of XProc XML London 2015

Agenda
1. XML Hacker Desperation
2. XMLCalabash & depify
3. Show & Tell
4. XProc Hacker Desperation
5. Summary
6. Goto pub
* Yes, I am going to ‘powerpoint’ you
* Raise your hand to ask a question

Email !!! The D.P.H.
xkcd.com - http://xkcd.com/208/ [xkcd-ref]

D.P.H. – a twinkling in SGML eye
• Desperate Perl Hacker
  – Paul Grosso 1997 xml-dev link
  – Google images ‘desperate perl hacker’ link
  – Etymological cousin of ‘Just Another Perl Hacker’ (JAPH)
  – Randal Schwartz aka Merlin
• What’s it all about?
  – GSD
  – Opaque one-liners (Perl Golf encouraged)
  – Even better if (regex|pipes|sed|awk) involved
  – Challenge: be able to munge XML with Perl

Desperate XML Hacker
• GAD (Get it All Done) with XML Stack
• ‘clever’ (and|or) ‘clear’
• Highly productive, albeit marooned anxious on ‘XML island’
• Working with XML means working with documents, and that means working with document workflows

All programmers are desperate
marklogic emacs ant xml xpath json xslt emacs java xquery gradle bash …

• Day 1 - transform an XML doc with XSLT
• Day 2 - run transform on set of docs
• Day 3 - generate multiple output formats
• Day 4 - read docs from database
• Day 5 - put results into database
• Day 6 - notify when it's done
• Day 7 - run assertions and validate results
• Day 8 - generate PNG from SVG for each document
• Day 9 - zip up files and upload them (w/ OAuth)
• Day 10 - create EPub
• And so forth …
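
A minimal sketch of what the first day of that list might look like in plain Java, using only the JDK's built-in JAXP transformer. The class name, the inline stylesheet and the sample document are assumptions made up for illustration; they are not from the talk.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical "Day 1" helper: one XML document in, one XSLT applied, text out.
final class DayOneTransform {

    // Illustrative stylesheet (an assumption): emits the text of /doc/title.
    static final String TITLE_STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/doc'><xsl:value-of select='title'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the given XSLT source to the given XML source and returns the result. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Rethrow unchecked so callers stay short in this sketch.
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints the title text of the sample document.
        System.out.println(transform("<doc><title>Diary</title></doc>", TITLE_STYLESHEET));
    }
}
```

Day 1 is a dozen lines. Each later day (looping over documents, database I/O, notification, zipping, upload) adds more glue code around this call, which is the ad hoc growth the following slides are about.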

Technology Selection
– XSLT
– XQuery
– Bash scripts
– Makefiles
– Ant
– Java
– All of the above?

TRANSFORM GENERATE zip PACKAGE notify upload Adhoc pipelines

Pipelines manage complexity
[McGrath 2004] Sean McGrath. Performing impossible feats of XML processing with pipelining, Proc XML Open 2004
• Transformation decomposition is the key to complexity management, just ask:
  – Henry Ford
  – Herbert Simon (The Two Watchmakers – “The Architecture of Complexity”)
  – George Miller (7+/-2)
  – Adam Smith (An Inquiry into the Nature And Causes of the Wealth of Nations, 1776)
  – Any electrical/chemical engineer
  – Michael A. Jackson
• Easy to build, test and reuse
• Segregation of business rules from grammar rules
• Enable group collaboration
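
The decomposition idea can be sketched in a few lines of Java: small single-purpose steps, chained so that the output of one feeds the next. This is only an illustration of the principle, not XProc and not anything shown in the talk; the class name, the three steps and the sample input are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical pipeline sketch: each step is a small String -> String function.
final class PipelineSketch {

    // Each step does one small job and knows nothing about its neighbours.
    static final Function<String, String> TRIM = String::trim;
    static final Function<String, String> COLLAPSE_SPACES = s -> s.replaceAll("\\s+", " ");
    // Illustrative only: no XML escaping is performed here.
    static final Function<String, String> WRAP_IN_TITLE = s -> "<title>" + s + "</title>";

    /** Chains the steps in order: the output of one is the input of the next. */
    static Function<String, String> pipeline(List<Function<String, String>> steps) {
        Function<String, String> composed = Function.identity();
        for (Function<String, String> step : steps) {
            composed = composed.andThen(step);
        }
        return composed;
    }

    public static void main(String[] args) {
        Function<String, String> p =
            pipeline(Arrays.asList(TRIM, COLLAPSE_SPACES, WRAP_IN_TITLE));
        System.out.println(p.apply("  Desperate   XML  Hacker "));
    }
}
```

Because every step has the same shape, each can be tested alone and reused in another pipeline, which is the "easy to build, test and reuse" point in miniature.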

Michael Kay, Balisage 2009 – ‘You Pull, I’ll Push: on the Polarity of Pipelines’
• ‘the code of each step in the pipeline is kept very simple’
• ‘very easy to assemble an application from a set of components, thus maximizing the potential for component reuse’
• ‘there is no requirement that each step in a pipeline should use the same technology; it's easy to mix XSLT, XQuery, Java and so on in different stages.’
http://www.balisage.net/Proceedings/vol3/html/Kay01/BalisageVol3-Kay01.html

Use all the XML technologies …

XML – The Good Parts (Modern XML Tier 1 / Modern XML Tier 2)
• Core: XML 1.0, Namespaces, XPath 1.0/2.0/3.0, XML Canonicalization
• Transform/Query: XSLT 1.0/2.0/3.0, XQuery 1.0/3.0, XSLT 1.0/2.0 (in browser)
• Processing: SAX, DOM, XProc?, XOM
• Other: XML Catalog, XForms, Schematron, XML Schema 1.0, RELAX-NG, XML Schema 1.1
• Semantics: RDF, OWL, SPARQL Update
• Vocabularies*: SVG, ‘Office’ Doc ML, …, MathML, Docbook, DITA, XHTML
- Amended from XML Amsterdam 2012 Keynote

Dependency Adoption (technology selection)

Dependency Adoption Helter skelter

Helter skelter
It’s more like this
http://upload.wikimedia.org/wikipedia/commons/thumb/b/ba/Helter_skelter.jpg/440px-Helter_skelter.jpg

The right Tool

Obligatory Jedi slide

But it works!

Java and XML

xml:Father - “XML gives Java something to do.”
• XML, Java, and the future of the Web, 1997, Jon Bosak - http://www.ibiblio.org/pub/suninfo/standards/xml/why/xmlapps.htm
• SAX, DOM
• Unicode support
• Distributed
• Care and feeding of the Java VM
• Invoke abstraction (classpath, jar fun)

Do Java and XML work better together?

Not enough time

Desire to be Productive

10x programmers is not a myth
• Augustine, N. R. 1979. "Augustine’s Laws and Major System Development Programs." Defense Systems Management Review: 50-76.
• Boehm, Barry W., and Philip N. Papaccio. 1988. "Understanding and Controlling Software Costs." IEEE Transactions on Software Engineering SE-14, no. 10 (October): 1462-77.
• Boehm, Barry, et al. 2000. Software Cost Estimation with Cocomo II. Boston, Mass.: Addison Wesley.
• Boehm, Barry W., T. E. Gray, and T. Seewaldt. 1984. "Prototyping Versus Specifying: A Multiproject Experiment." IEEE Transactions on Software Engineering SE-10, no. 3 (May): 290-303. Also in Jones 1986b.
• Card, David N. 1987. "A Software Technology Evaluation Program." Information and Software Technology 29, no. 6 (July/August): 291-300.
• Curtis, Bill. 1981. "Substantiating Programmer Variability." Proceedings of the IEEE 69, no. 7: 846.
• Curtis, Bill, et al. 1986. "Software Psychology: The Need for an Interdisciplinary Program." Proceedings of the IEEE 74, no. 8: 1092-1106.
• DeMarco, Tom, and Timothy Lister. 1985. "Programmer Performance and the Effects of the Workplace." Proceedings of the 8th International Conference on Software Engineering. Washington, D.C.: IEEE Computer Society Press, 268-72.
• DeMarco, Tom, and Timothy Lister. 1999. Peopleware: Productive Projects and Teams, 2d Ed. New York: Dorset House.
• Mills, Harlan D. 1983. Software Productivity. Boston, Mass.: Little, Brown.
• Sackman, H., W. J. Erikson, and E. E. Grant. 1968. "Exploratory Experimental Studies Comparing Online and Offline Programming Performance." Communications of the ACM 11, no. 1 (January): 3-11.
• Valett, J., and F. E. McGarry. 1989. "A Summary of Software Measurement Experiences in the Software Engineering Laboratory." Journal of Systems and Software 9, no. 2 (February): 137-48.
• Weinberg, Gerald M., and Edward L. Schulman. 1974. "Goals and Performance in Computer Programming." Human Factors 16, no. 1 (February): 70-77.

Except when it is a myth
• technical debt
  – Maintainable/Upgrade
  – Add new features
  – Enterprise requirements
• more bugs
• brittle code
Upfront design
Technology selection
Balancing trade-offs to achieve sum gain

reflection
• Desperate people do desperate things
  – Use all the XML technologies
  – Dependency adoption
  – Not the right tool
  – Not enough time
  – Being productive

avoid being a D.X.H.
• Careful technology selection
• Manage your dependencies
• Avoid distributing logic up/down/across tech stack (hint: don’t use bash, makefiles, ant, etc)
• Simplify interaction with Java (VM)
• Model pipelines (hint: XProc)

avoid being a D.X.H.
• Use XProc (XMLCalabash)
  – XProc is designed for XML processing pipelines
  – Extensible
  – Simplify and aggregate logic
• Use XProc extension steps (depify)
  – XProc w/o extension steps is half of XProc
  – Provide façade over other technologies

We use pipelines
• John Lumley – worked with DITA OT
• Sandro Cirulli - workflow (pull scm, push db, process)
• Nic Gibson – conversion workflows
• Philip Fearon - types of workflows (seq and concurrent) with XMLFlow
• Andrew Sales – schematron on word docs (used Ant)
• ….
• most talks mentioned workflow/pipeline
  – ~100 mentions in proceedings
  – guesstimate ~6 mentions per hour during the talks

Desperate XProc Hacker
• XProc learning curve
  – v1.0 verbose in places
  – XProc generic by design
  – Some ‘Batteries not included’
• XProc v2.0 addresses this
  – Simplify connecting steps
  – Simplify parameters (maps)
  – Flow control
  – Metadata
  – Anything ‘flows’
  – avt/tvt
  – Syntactic optimisations
• depify provides a way to distribute and reuse extension steps, which beats the problems that arise with the ‘hairball’ approach

XMLCalabash & depify
• XMLCalabash
  – XProc processor
  – Norm Walsh
  – http://xmlcalabash.com/
• depify
  – XProc dependency management
  – http://depify.com/

XMLCalabash extension steps

    package com.example.library;

    import com.xmlcalabash.library.DefaultStep;
    … elided …
    import com.xmlcalabash.runtime.XAtomicStep;

    @XMLCalabash(
        name = "ex:hello-world",
        type = "{http://example.org/xmlcalabash/steps}hello-world")
    public class HelloWorld extends DefaultStep {
        // Pipe bound to the step's single output port.
        private WritablePipe result = null;

        public HelloWorld(XProcRuntime runtime, XAtomicStep step) {
            super(runtime, step);
        }

        // Remember the pipe handed in for the output port.
        public void setOutput(String port, WritablePipe pipe) {
            result = pipe;
        }

        // Clear the output pipe so the step can be run again.
        public void reset() {
            result.resetWriter();
        }

        // Build the result document and write it to the output pipe.
        public void run() throws SaxonApiException {
            super.run();
            … elided …
            tree.addText("Hello World");
            … elided …
            result.write(tree.getResult());
        }
    }

Library for the step

    <p:library version="1.0"
               xmlns:p="http://www.w3.org/ns/xproc"
               xmlns:c="http://www.w3.org/ns/xproc-step"
               xmlns:ex="http://example.org/xmlcalabash/steps">
      <p:declare-step type="ex:hello-world">
        <p:output port="result"/>
      </p:declare-step>
    </p:library>

library xpl included in jar

    M Filemode      Length  Date         Time      File
    - ----------  --------  -----------  --------  ----------------------------
      drwxr-xr-x         0   8-Mar-2015  10:43:38  META-INF/
      -rw-r--r--       843   8-Mar-2015  10:43:38  META-INF/MANIFEST.MF
      drwxr-xr-x         0   8-Mar-2015  10:43:38  com/example/library/
      -rw-r--r--      2062   8-Mar-2015  10:43:38  com/example/library/HelloWorld.class
      drwxr-xr-x         0   8-Mar-2015  10:43:38  META-INF/annotations/
      -rw-r--r--        31   8-Mar-2015  10:43:38  META-INF/annotations/com.xmlcalabash.core.XMLCalabash
      -rw-r--r--       294  19-Feb-2015  15:41:00  example-library.xpl
    - ----------  --------  -----------  --------  ----------------------------
                      3230                         9 files
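
A quick way to confirm that a step jar really carries its library file is to list its `.xpl` entries. The sketch below uses only `java.util.jar` from the JDK; the class name and the throwaway demo jar (two placeholder entries named after the listing above) are assumptions so the example runs without a real build, and it says nothing about how XMLCalabash itself locates the library.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;

// Hypothetical helper for checking what a step jar contains.
final class StepJarCheck {

    /** Returns the names of all .xpl entries in the given jar. */
    static List<String> xplEntries(Path jar) {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.endsWith(".xpl"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a tiny stand-in jar with placeholder content (not a real step). */
    static Path writeDemoJar() {
        try {
            Path jar = Files.createTempFile("example-step", ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                String[] names = {
                    "com/example/library/HelloWorld.class",
                    "example-library.xpl"
                };
                for (String name : names) {
                    jos.putNextEntry(new JarEntry(name));
                    jos.write("placeholder".getBytes(StandardCharsets.UTF_8));
                    jos.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Lists the .xpl entries found in the demo jar.
        System.out.println(xplEntries(writeDemoJar()));
    }
}
```

Pointing `xplEntries` at a real step jar instead of the demo one gives the same check as reading the listing by eye.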

depify
• depify.com
• depify client
• depify github

• Usage of XMLCalabash
• Usage of depify
• Develop your own step
• Distribute with depify

depify future
• Gradle plugin
• Depify into other repos to enable day zero bootstrap (w/ yum, etc)
• Integration (expath package management)
• More steps

Summary
• XProc extension steps provide reuse
• XProc v2.0 lets you work in broader context
• Pipelines manage complexity
• depify specifically built for XProc (XMLCalabash)
• Reuse with existing mechanisms (ex. Maven)

How to Become a Delighted XProc Hacker
• Stop using bash, makefiles, ant or bending XML tech to control main loop
• Stop making adhoc pipelines
• model pipelines with XProc (XMLCalabash)
• try out ext steps (depify)
• GSD
• reuse and distribute new steps (depify)
• goto pub

Thank you for your attention and time, questions ? <pub/>
