Build a Word Document from RPG with a Table
Date Posted: June 23, 2011 12:00 AM

Q: I read your article "How to Create a Word Document in RPG" (March 10, 2011, article ID 65833). Great article! Works like a charm for single values in a form-letter approach. But how would you insert a list of data? For example, how can I insert a table containing a price list for a customer, when the number of items isn't known at compile time, so a simple find and replace won't work?

A: To do that, you need to look at the structure of the XML document and write the appropriate XML to build the table. In this article, I demonstrate how to do that.

Review

In the previous article, I explained that .docx files are actually .zip files, and inside the file is a directory structure, containing multiple XML files and potentially other types of files as well.

Although a Word document holds a lot of information, the core content of the document is kept in a single XML file, word/document.xml, inside the archive.

This word/document.xml file is kept in UTF-8 (CCSID 1208) format. If you unzip the .docx file, you can open the word/document.xml file by using the IFS APIs, and you can manipulate its contents any way that you like.
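To make the "a .docx is a zip archive with UTF-8 XML inside" point concrete, here is a minimal sketch. It is written in Python purely for illustration (the article's own code is RPG using the IFS APIs), and the function name read_document_xml and the tiny in-memory stand-in archive are hypothetical, not part of any sample program.

```python
import io
import zipfile


def read_document_xml(docx_bytes):
    """Return the text of word/document.xml from a .docx archive."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        # The part is stored as UTF-8 (CCSID 1208), so decode it explicitly.
        return archive.read("word/document.xml").decode("utf-8")


# Build a tiny stand-in archive in memory so the sketch runs without a real file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<w:document>caf\u00e9</w:document>")

document_xml = read_document_xml(buffer.getvalue())
print(document_xml)
```

With a real file you would pass the bytes of the .docx itself; the rest of the archive is left untouched.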

Structure of a Table in Word

Personally, I find the XML inside a Word document a bit overwhelming: it stores so many details, in so many tags, that I quickly get lost. I don't attempt to fully understand the format. Instead, I start by creating a document in Word itself, and then I just add my data to it. I still need to understand some of the XML, though, because it has to be repeated for each row I want to add to the table in my Word document.

To help you get the basic idea of the structure of a Word document, start by looking at a very simple one (without a table):

<w:document>
  <w:body>
    <w:p>
      <w:r>
        <w:t> text data </w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>

This is a Word document that contains the words 'text data', and nothing else. The following are the XML elements in the document:

w:document The entire Word document is always inside this tag.
w:body The body (or data) of the document.
w:p A paragraph
w:r A "run" of text within a paragraph.
w:t This is the text. Always occurs inside a <w:r> tag.
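If it helps to see these elements handled by a program, the following is a small sketch (Python, for illustration only; it is not how WORDTABLE works, which uses plain string scanning). It assumes the document declares the standard WordprocessingML namespace for the w: prefix, which the simplified listing above omits. The function name paragraph_texts is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "w:" prefix in real documents (omitted in the listing above).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

SAMPLE = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:r>
        <w:t>text data</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_texts(document_xml):
    """Return one string per <w:p>, joining the text of all its <w:t> elements."""
    root = ET.fromstring(document_xml)
    texts = []
    for paragraph in root.iterfind(".//w:p", NS):
        pieces = (t.text or "" for t in paragraph.iterfind(".//w:t", NS))
        texts.append("".join(pieces))
    return texts


print(paragraph_texts(SAMPLE))
```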

When there's a table, there's a <w:tbl> tag inside the body of the Word document. The paragraph, run, and text data are located inside the cells of the table. The basic structure of a table looks like this:

<w:tbl>
  <w:tr>
    <w:tc> ... row 1, column 1 ... </w:tc>
    <w:tc> ... row 1, column 2 ... </w:tc>
  </w:tr>
  <w:tr>
    <w:tc> ... row 2, column 1 ... </w:tc>
    <w:tc> ... row 2, column 2 ... </w:tc>
  </w:tr>
</w:tbl>

w:tbl Identifies a table.
w:tr Identifies a table row.
w:tc Identifies a cell within a row.
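As a rough illustration of how rows and cells nest, this sketch (Python, for illustration only) walks a table and returns its cell text as a list of rows. The helper names cell and table_rows are hypothetical, the sample table is made up to mirror the listing above, and the namespace declaration is an assumption the simplified listing leaves out.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def cell(text):
    """Build the XML for one table cell holding a single paragraph of text."""
    return f"<w:tc><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:tc>"


SAMPLE_TABLE = (
    f'<w:tbl xmlns:w="{W_NS}">'
    f"<w:tr>{cell('row 1, column 1')}{cell('row 1, column 2')}</w:tr>"
    f"<w:tr>{cell('row 2, column 1')}{cell('row 2, column 2')}</w:tr>"
    "</w:tbl>"
)


def table_rows(table_xml):
    """Return the table as a list of rows, each row a list of cell strings."""
    table = ET.fromstring(table_xml)
    rows = []
    for tr in table.findall("w:tr", NS):          # direct child rows
        row = []
        for tc in tr.findall("w:tc", NS):         # direct child cells
            texts = (t.text or "" for t in tc.iterfind(".//w:t", NS))
            row.append("".join(texts))
        rows.append(row)
    return rows


print(table_rows(SAMPLE_TABLE))
```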

Within each of these cells will be the paragraph, run, and text elements. There can also be other elements that describe the formatting of the data, the size of the columns, and other table, row, cell, paragraph, and run properties. Properties are identified by the tag name followed by Pr, so you might have:

w:tblPr table properties
w:trPr row properties
w:tcPr cell properties
w:pPr paragraph properties
w:rPr run properties

These are only some of the XML tags that can exist within the document, and so far my document is already getting cluttered and hard to read because there are so many details in it. If you've followed me so far (and I wouldn't blame you for being lost, but please bear with me), the tags fit together like this:

<w:document>
  <w:body>
    <w:tbl>
      <w:tblPr> .. table properties .. </w:tblPr>
      <w:tr>
        <w:trPr> .. row properties .. </w:trPr>
        <w:tc>
          <w:tcPr> .. cell properties .. </w:tcPr>
          <w:p>
             <w:pPr> .. paragraph properties .. </w:pPr>
             <w:r>
               <w:rPr> .. text run properties .. </w:rPr>
               <w:t> ...... your cell's text data goes here ...... </w:t>
             </w:r>
          </w:p>
        </w:tc>
        .. more columns go here ..
      </w:tr>
      .. more rows go here ..
    </w:tbl>
  </w:body>
</w:document>

That's only enough XML for a single cell of a table. You see what I mean? It gets overwhelming. So I'm going to create my document in Word rather than try to generate all the proper XML in my program. It's just easier that way, but then I'll find the rows and columns in the table, and I'll duplicate them with my data.

Creating a Word Document Template

Because I've decided to let Word generate the XML, I fire up Word 2007 (yes, I'm still using 2007!) and create a template for my price list. I know my RPG program will need to fill in the price details, so in Word I put placeholders where I want the RPG program to insert data.

Here's what I came up with for my price list:


[Image: the price list template as it appears in Word]

It's a very simple Word document that took me only a few minutes to make, yet the first thing I notice is how much nicer it looks than my traditional RPG reports. I can see why you asked this question!

In my shop, each item that we sell is called a "product," and this particular price list will contain prices for Direct Store Delivery (DSD) in zones 1, 2, and 3, as well as Distributor/Jobber (JOB) pricing for zones 1 and 2. I'm not certain how other people's price lists are organized, but that's the way we organize ours where I work, at Klement's Sausage. You can adapt it according to your own needs and tastes, of course.

I decided to have a light blue background behind every alternating row—much like the blue bar paper we printed on years ago. It makes it a little easier for people's eyes to follow across the rows. That means my template needs a "shaded" row and an "unshaded" row. These will, of course, correspond to <w:tr> elements in the resulting XML document.

The PROD, DSD1, DSD2, DSD3, JOB1, and JOB2 text are actually placeholders for my RPG program to search for and replace with data from a file. You can use any string you like for the placeholders, of course; I just think these are easy to remember.

The DAYOFWEEK text is also a placeholder—useful, for example, if you release new price lists on Tuesdays and Fridays.

This .docx file is named "Price List Template.docx" because it is my template for creating the price lists. I unzip the template, insert my data into it, and zip it back up with a different name and store it in the IFS, where users will take it and read it when they want to know the prices.
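The unzip, modify, and re-zip cycle can be sketched as follows. This is Python for illustration only; the article's program does this work with zip/unzip (or jar) utilities called from RPG. The function name repack_docx and the two-entry stand-in template are hypothetical.

```python
import io
import zipfile


def repack_docx(template_bytes, new_document_xml):
    """Copy every part of the template archive, swapping in a new word/document.xml."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for name in source.namelist():
            if name == "word/document.xml":
                data = new_document_xml.encode("utf-8")  # stored as UTF-8
            else:
                data = source.read(name)                 # copied unchanged
            target.writestr(name, data)
    return output.getvalue()


# Stand-in template with two parts, built in memory so the sketch is runnable.
template = io.BytesIO()
with zipfile.ZipFile(template, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("word/document.xml", "<w:document>template</w:document>")

new_docx = repack_docx(template.getvalue(), "<w:document>filled in</w:document>")
```

Writing the result under a different name leaves the template itself untouched, so it can be reused for the next price list.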

Inserting the Data from RPG

In order to insert my data, my RPG program does the following:

  • Unzip the template to a temporary IFS directory.
  • Set the CCSID of the file to 1208, so translation works properly.
  • Open the IFS file with the open() API.
  • Read the XML data into memory (I let the IFS API convert it to EBCDIC).
  • Scan for the first </w:tr> tag. This is the end of the first row of the table, which contains the column headings, and shouldn't be changed.
  • Everything from the start of the document to the first </w:tr> tag is the "header" information and should be written once to the top of my output XML document.
  • I replace DAYOFWEEK in the header information with the actual day of the week before I write it.
  • Everything from there up to the next </w:tr> tag is the first data row (the shaded row) of my table. I save that in a string.
  • Everything from there up to the last </w:tr> tag is the unshaded row. I save that to a string as well.
  • I loop through my price list file. For each record, I take a copy of the shaded or unshaded row string (alternating between the two), scan and replace the placeholders with actual data from my file, and write the result to disk.
  • After the whole price list is written, I copy the rest of the XML data from the word/document.xml file to my output file, as is, to complete the document.
  • I zip the data back up into a .docx file with a new name in the IFS.
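The scanning and substitution steps above can be sketched like this. It is a minimal illustration in Python, not the WORDTABLE source (which is RPG); the function names split_template and fill_price_list, the simplified template string, and the sample items are all hypothetical. It assumes the template table has exactly three rows (headings, shaded, unshaded). One addition of mine that the steps do not mention: values are XML-escaped so that characters such as & do not break the document.

```python
from xml.sax.saxutils import escape

ROW_END = "</w:tr>"


def split_template(xml):
    """Split the XML into header, shaded row, unshaded row, and trailer."""
    first = xml.index(ROW_END) + len(ROW_END)          # end of the heading row
    second = xml.index(ROW_END, first) + len(ROW_END)  # end of the shaded row
    last = xml.rindex(ROW_END) + len(ROW_END)          # end of the unshaded row
    return xml[:first], xml[first:second], xml[second:last], xml[last:]


def fill_price_list(xml, day_of_week, items):
    """Repeat the two template rows, alternating, once per item."""
    header, shaded, unshaded, trailer = split_template(xml)
    parts = [header.replace("DAYOFWEEK", escape(day_of_week))]
    for number, item in enumerate(items):
        row = shaded if number % 2 == 0 else unshaded   # first data row is shaded
        for placeholder, value in item.items():
            row = row.replace(placeholder, escape(str(value)))
        parts.append(row)
    parts.append(trailer)                               # rest of the document, as is
    return "".join(parts)


# Simplified stand-in for word/document.xml (real rows contain cells, runs, etc.).
TEMPLATE = (
    "<w:body><w:p>Prices for DAYOFWEEK</w:p><w:tbl>"
    "<w:tr>heading</w:tr>"
    "<w:tr>shaded PROD DSD1</w:tr>"
    "<w:tr>plain PROD DSD1</w:tr>"
    "</w:tbl></w:body>"
)
ITEMS = [
    {"PROD": "Item A", "DSD1": "1.00"},
    {"PROD": "Item B & C", "DSD1": "2.00"},
    {"PROD": "Item D", "DSD1": "3.00"},
]

result = fill_price_list(TEMPLATE, "Tuesday", ITEMS)
print(result)
```

Because the rows are copied as whole strings, all of the formatting Word generated (shading, column widths, fonts) comes along for free, which is the point of starting from a template.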

To demonstrate this technique, I created a program named WORDTABLE. It calls QShell and PASE utilities to unzip the file and uses the IFS APIs to read the file into memory, as described above. The resulting Word document looks like this:


[Image: the finished price list in Word]

Rather than try to include the RPG code in this article, I've provided the complete WORDTABLE source code here for download.

InfoZip vs. Java's jar Utility

Q: Did you try the QShell jar utility to create the zip file? (I'm still using Office 2003, or I'd try it myself.) If the zip that jar creates is acceptable to Word, you wouldn't need any extra products to deal with zip and unzip.

A: I must admit, I'm not a fan of using jar for Zip files. I learned to use it in the 1990s when I learned Java, but I found it slow and much more limited than InfoZip.

However, after receiving your question, I tried modifying WORDTABLE to use jar instead of InfoZip, and it works! It's slower than InfoZip (as expected) but may save you from needing to install a separate tool.

If you download the WORDTABLE source code (link above) you can compile it to use either InfoZip or jar. For jar support, compile the program normally:

   CRTBNDRPG WORDTABLE DBGVIEW(*LIST)

Or, if you have InfoZip available and prefer it (as I do), you can compile WORDTABLE with the USE_INFOZIP compiler condition defined so that it uses InfoZip instead. For example:

   CRTBNDRPG WORDTABLE DBGVIEW(*LIST) DEFINE(USE_INFOZIP)

QShell Messages on the Screen

Q: When I run your sample program from "How to Create a Word Document in RPG," it displays a bunch of unzip/zip messages and asks the user to press a key. Is there a way to avoid that? If an error occurs in QShell, how would I know?

A: QShell has environment variables named QIBM_QSH_CMD_OUTPUT (controls where QShell messages go) and QIBM_QSH_CMD_ESCAPE_MSG (controls how errors are reported to your program). In the code for this article (WORDTABLE), I set these variables with the following RPG code:

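     D putenv          PR            10i 0 ExtProc('putenv')
     D   nameval                       *   value options(*string)
       .
       .
       // Append QShell output to a log file instead of the display
       putenv('QIBM_QSH_CMD_OUTPUT=FILEAPPEND=/tmp/wordtable.log');
       // Send an *ESCAPE message to the caller if a QShell command fails
       putenv('QIBM_QSH_CMD_ESCAPE_MSG=Y');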

This means that any output from the zip/unzip/jar tools is added to a log file named /tmp/wordtable.log in the IFS. If any errors occur during the QShell command, an *ESCAPE message (a "halt") is sent to the RPG program. If you run the RPG program and get a QSHxxxx error, simply look at the /tmp/wordtable.log file to see the error messages. This way, none of the informational messages are printed on the user's display. You could add the preceding code to the WORDREPL example from my previous article as well if you want to disable the QShell output in that sample program.

Thanks for all the questions!


  • venkateshwar.venkataraman@allianz.ie
    10 months ago
    Jul 19, 2011

    Scott,
    Great article ... this has got potential and I must say that this has been one of the best articles from you in recent times.

    Venky

  • tips@scottklement.com
    11 months ago
    Jun 27, 2011

    hello,



    Back when I started the HTTPAPI project, I copied IFSIO_H into that project. Since then, the two copies of IFSIO_H have been maintained individually, so there may be differences between them.



    The one in this article is the "current" one, and the most complete one as of the date of the article. I suggest that you only use the one in HTTPAPI for compiling HTTPAPI itself, and you use the one included here for any other projects.

  • AS400doofus
    11 months ago
    Jun 24, 2011

    Thanks for another great article Scott. The code for "Build a Word Document from RPG with a Table" comes with a different version of IFSIO_H. I think our current version came from HTTPAPI. Do you maintain a current version of this file? Is it usually backwards compatible?
