Q: I read your article "How to Create a Word Document in RPG" (March 10, 2011, article ID 65833). Great article! Works like a charm for single values in a form-letter approach. But how would you insert a list of data? For example, how can I insert a table containing a price list for a customer, when the number of items isn't known at compile time, so a simple find and replace won't work?
A: To do that, you need to look at the structure of the XML document and write the appropriate XML to build the table. In this article, I demonstrate how to do that.
Review
In the previous article, I explained that .docx files are actually .zip files, and inside the file is a directory structure, containing multiple XML files and potentially other types of files as well.
Although there's a lot of information in a Word document, the core data of the document itself is kept in a single XML file, word/document.xml, inside the archive.
This word/document.xml file is kept in UTF-8 (CCSID 1208) format. If you unzip the .docx file, you can open the word/document.xml file by using the IFS APIs, and you can manipulate its contents any way that you like.
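To make the "a .docx is a .zip" point concrete, here is a minimal sketch in Python (used only for illustration; the article's own program is RPG and works with the IFS APIs instead). The helper names read_document_xml and make_tiny_docx are hypothetical, and the stand-in archive contains only word/document.xml, so it is not a file Word would open.

```python
import io
import zipfile


def read_document_xml(docx_bytes: bytes) -> str:
    """Return the text of word/document.xml from a .docx held in memory."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        # The member is stored as UTF-8 (CCSID 1208), so decode it as such.
        return archive.read("word/document.xml").decode("utf-8")


def make_tiny_docx(document_xml: str) -> bytes:
    """Build a stand-in archive holding only word/document.xml (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", document_xml.encode("utf-8"))
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_tiny_docx("<w:document><w:body/></w:document>")
    print(read_document_xml(sample))
```

With a real file, you would pass the bytes of the .docx itself instead of the stand-in archive.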
Structure of a Table in Word
Personally, I find the XML inside a Word document a bit overwhelming. So many details are stored in it, and therefore so many tags, that I quickly get lost. Rather than attempt to fully understand the format, I start by creating a document in Word itself and then add my data to it. I still need to understand part of the XML, though, because that part has to be repeated for each row I add to the table in my Word document.
To help you get the basic idea of the structure of a Word document, start by looking at a very simple one (without a table):
<w:document>
  <w:body>
    <w:p>
      <w:r>
        <w:t> text data </w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>
This is a Word document that contains the words 'text data', and nothing else. The following are the XML elements in the document:
| w:document | The entire Word document is always inside this tag. |
| w:body | The body (or data) of the document. |
| w:p | A paragraph. |
| w:r | A "run" of text within a paragraph. |
| w:t | The text itself. Always occurs inside a <w:r> tag. |
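As a rough illustration of how these elements nest, the following Python sketch walks body, paragraph, run, and text to collect the text of each paragraph. It assumes the standard WordprocessingML namespace declaration on w:document, which the simplified sample above leaves out; the function name paragraph_texts is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace that Word binds to the "w" prefix in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

SIMPLE_DOC = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:r>
        <w:t>text data</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_texts(document_xml: str) -> list[str]:
    """Return the text of each top-level paragraph in the body."""
    root = ET.fromstring(document_xml)  # root is the w:document element
    texts = []
    for paragraph in root.iterfind("w:body/w:p", NS):
        # A paragraph's text is the concatenation of its runs' w:t elements.
        runs = paragraph.iterfind("w:r/w:t", NS)
        texts.append("".join(t.text or "" for t in runs))
    return texts


if __name__ == "__main__":
    print(paragraph_texts(SIMPLE_DOC))
```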
When there's a table, there's a <w:tbl> tag inside the body of the Word document. The paragraph, run, and text data are located inside the cells of the table. The basic structure of a table looks like this:
<w:tbl>
  <w:tr>
    <w:tc> ... row 1, column 1 ... </w:tc>
    <w:tc> ... row 1, column 2 ... </w:tc>
  </w:tr>
  <w:tr>
    <w:tc> ... row 2, column 1 ... </w:tc>
    <w:tc> ... row 2, column 2 ... </w:tc>
  </w:tr>
</w:tbl>
| w:tbl | Identifies a table. |
| w:tr | Identifies a table row. |
| w:tc | Identifies a cell within a row. |
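The row and cell nesting can be sketched the same way. This Python example (again only an illustration, not the article's RPG code) reads a small two-by-two table into a list of rows. The sample table and the function name table_rows are assumptions made for the demo.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def cell(text: str) -> str:
    """Build one table cell holding a single paragraph/run/text (demo helper)."""
    return f"<w:tc><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:tc>"


SAMPLE_TABLE = (
    f'<w:tbl xmlns:w="{W_NS}">'
    f"<w:tr>{cell('A1')}{cell('B1')}</w:tr>"
    f"<w:tr>{cell('A2')}{cell('B2')}</w:tr>"
    "</w:tbl>"
)


def table_rows(table_xml: str) -> list[list[str]]:
    """Return the table as a list of rows, each a list of cell texts."""
    table = ET.fromstring(table_xml)
    rows = []
    for row in table.iterfind("w:tr", NS):
        cells = []
        for tc in row.iterfind("w:tc", NS):
            # Gather every w:t below the cell, whatever paragraph or run it is in.
            cells.append("".join(t.text or "" for t in tc.iter(f"{{{W_NS}}}t")))
        rows.append(cells)
    return rows


if __name__ == "__main__":
    for row in table_rows(SAMPLE_TABLE):
        print(row)
```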
Within each of these cells will be the paragraph, run, and text elements. There can also be other elements that describe the formatting of the data, the size of the columns, and other table, row, column, paragraph, run, and text properties. Properties are identified by the tag name followed by Pr, so you might have:
| w:tblPr | Table properties |
| w:trPr | Row properties |
| w:tcPr | Cell properties |
| w:pPr | Paragraph properties |
| w:rPr | Run properties |
| w:tPr | Text properties |
These are only some of the XML tags that can exist within the document, and so far my document is already getting cluttered and hard to read because there are so many details in it. If you've followed me so far (and I wouldn't blame you for being lost, but please bear with me), the tags fit together like this:
<w:document>
  <w:body>
    <w:bodyPr> .. body properties .. </w:bodyPr>
    <w:tbl>
      <w:tblPr> .. table properties .. </w:tblPr>
      <w:tr>
        <w:trPr> .. row properties .. </w:trPr>
        <w:tc>
          <w:tcPr> .. cell properties .. </w:tcPr>
          <w:p>
            <w:pPr> .. paragraph properties .. </w:pPr>
            <w:r>
              <w:rPr> .. text run properties .. </w:rPr>
              <w:t>
                <w:tPr> .. text properties .. </w:tPr>
                ...... your cell's text data goes here ......
              </w:t>
            </w:r>
          </w:p>
        </w:tc>
        .. more columns go here ..
      </w:tr>
      .. more rows go here ..
    </w:tbl>
  </w:body>
</w:document>
That's only enough XML for a single cell of a table. You see what I mean? It gets overwhelming. So I'm going to create my document in Word rather than try to generate all the proper XML in my program. It's just easier that way, but then I'll find the rows and columns in the table, and I'll duplicate them with my data.
Creating a Word Document Template
Because I've decided to let Word generate the XML, I fire up Word 2007 (yes, I'm still using 2007!) and create a template for my price list. I know my RPG program will need to fill in the price details, so in Word I put placeholders where I want the RPG program to insert data.
Here's what I came up with for my price list:

[Image: the price list template as it appears in Word]
It's a very simple Word document, and it took me only a few minutes to make. Even so, the first thing I notice is how much nicer it looks than my traditional RPG reports. I can see why you asked this question!
In my shop, each item that we sell is called a "product," and this particular price list will contain prices for Direct Store Delivery (DSD) in zones 1, 2, and 3, as well as Distributor/Jobber (JOB) pricing for zones 1 and 2. I'm not certain how other people's price lists are organized, but that's the way we organize ours where I work, at Klement's Sausage. You can adapt it according to your own needs and tastes, of course.
I decided to have a light blue background behind every alternating row—much like the blue bar paper we printed on years ago. It makes it a little easier for people's eyes to follow across the rows. That means my template needs a "shaded" row and an "unshaded" row. These will, of course, correspond to <w:tr> elements in the resulting XML document.
The PROD, DSD1, DSD2, DSD3, JOB1, and JOB2 text are actually placeholders for my RPG program to search for and replace with data from a file. You can use any string you like for the placeholders, of course; I just think these are easy to remember.
The DAYOFWEEK text is also a placeholder—useful, for example, if you release new price lists on Tuesdays and Fridays.
This .docx file is named "Price List Template.docx" because it is my template for creating the price lists. I unzip the template, insert my data into it, and zip it back up with a different name and store it in the IFS, where users will take it and read it when they want to know the prices.
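The unzip, modify, and re-zip cycle can be sketched in a few lines of Python. This is only an illustration of the idea under simplified assumptions (everything held in memory, a hypothetical function name repack_docx); the article's program does the equivalent work with zip utilities and IFS files.

```python
import io
import zipfile


def repack_docx(template_bytes: bytes, new_document_xml: str) -> bytes:
    """Copy every member of the template archive, swapping in a new word/document.xml."""
    source = zipfile.ZipFile(io.BytesIO(template_bytes))
    out_buffer = io.BytesIO()
    with source, zipfile.ZipFile(out_buffer, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            if item.filename == "word/document.xml":
                # Replace the document body; Word expects UTF-8 here.
                target.writestr(item.filename, new_document_xml.encode("utf-8"))
            else:
                # Styles, relationships, and so on are copied unchanged.
                target.writestr(item, source.read(item.filename))
    return out_buffer.getvalue()
```

To use it with real files, read the bytes of "Price List Template.docx", call repack_docx with the filled-in XML, and write the result under a new file name.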
Inserting the Data from RPG
In order to insert my data, my RPG program does the following:
- Unzip the template to a temporary IFS directory.
- Set the CCSID of the XML file to 1208 so that translation works properly.
- Open the IFS file with the open() API.
- Read the XML data into memory (I let the IFS API convert it to EBCDIC).
- Scan for the first </w:tr> tag. This is the end of the first row of the table, which contains the column headings and shouldn't be changed.
- Treat everything from the start of the document to the first </w:tr> tag as the "header" information, which is written once to the top of the output XML document.
- Replace DAYOFWEEK in the header information with the actual day of the week before writing it.
- Save everything from there up to the next </w:tr> tag in a string. This is the first data row (the shaded row) of the table.
- Save everything from there up to the following </w:tr> tag in a second string. This is the unshaded row.
- Loop through the price list file. For each record, replace the placeholders in a copy of the shaded or unshaded row (alternating between the two) with actual data from the file, and write the result to disk.
- After the whole price list is written, copy the rest of the XML data from the word/document.xml file to the output file, as is, to complete the document.
- Zip the data back up into a .docx file with a new name in the IFS.
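The split-and-repeat logic in these steps can be sketched in Python as follows. This is not the WORDTABLE source; it is a simplified illustration that works on an in-memory string rather than IFS files. The function names, the tiny TEMPLATE string (with comments standing in for the real shading properties), and the XML escaping of the data values are assumptions added for the example.

```python
import re
from xml.sax.saxutils import escape

ROW_END = "</w:tr>"


def split_template(xml: str):
    """Split at the first three </w:tr> tags: header, shaded row, unshaded row, trailer."""
    pieces, start = [], 0
    for _ in range(3):
        end = xml.index(ROW_END, start) + len(ROW_END)
        pieces.append(xml[start:end])
        start = end
    header, shaded, unshaded = pieces
    return header, shaded, unshaded, xml[start:]


def fill(template: str, values: dict) -> str:
    """Replace each placeholder in one pass, escaping &, < and > in the data."""
    if not values:
        return template
    pattern = re.compile("|".join(re.escape(name) for name in values))
    return pattern.sub(lambda m: escape(str(values[m.group(0)])), template)


def build_document(xml: str, day_of_week: str, price_rows: list) -> str:
    header, shaded, unshaded, trailer = split_template(xml)
    out = [fill(header, {"DAYOFWEEK": day_of_week})]
    for i, row in enumerate(price_rows):
        # Even-numbered records use the shaded row, odd-numbered the unshaded one.
        out.append(fill(shaded if i % 2 == 0 else unshaded, row))
    out.append(trailer)  # the rest of the document, copied as is
    return "".join(out)


# Heavily simplified stand-in for word/document.xml (assumption for the demo).
TEMPLATE = (
    "<w:body><w:p>Prices for DAYOFWEEK</w:p><w:tbl>"
    "<w:tr><w:tc>Product</w:tc><w:tc>DSD 1</w:tc></w:tr>"
    "<w:tr><!-- shaded --><w:tc>PROD</w:tc><w:tc>DSD1</w:tc></w:tr>"
    "<w:tr><!-- plain --><w:tc>PROD</w:tc><w:tc>DSD1</w:tc></w:tr>"
    "</w:tbl></w:body>"
)

if __name__ == "__main__":
    rows = [{"PROD": "Brats & Kraut", "DSD1": "4.99"},
            {"PROD": "Salami", "DSD1": "5.49"}]
    print(build_document(TEMPLATE, "Tuesday", rows))
```

Replacing all placeholders in a single pass avoids accidentally substituting text inside a value that was just inserted, and escaping keeps characters such as & in a product name from producing invalid XML.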
To demonstrate this technique, I created a program named WORDTABLE. It calls QShell and PASE utilities to unzip the file and uses the IFS APIs to read the file into memory, as described above. The resulting Word document looks like this:

[Image: the finished price list as it appears in Word]
Rather than try to include the RPG code in this article, I've provided the complete WORDTABLE source code here for download.
InfoZip vs. Java's jar Utility
Q: Did you try the QShell jar utility to create the zip file? (I'm still using Office 2003, or I'd try it myself.) If the zip that jar creates is acceptable to Word, you wouldn't need any extra products to deal with zip and unzip.
A: I must admit, I'm not a fan of using jar for Zip files. I learned to use it in the 1990s when I learned Java, but I found it slow and much more limited than InfoZip.
However, after receiving your question, I tried modifying WORDTABLE to use jar instead of InfoZip, and it works! It's slower than InfoZip (as expected) but may save you from needing to install a separate tool.
If you download the WORDTABLE source code (link above), you can compile it to use either InfoZip or jar. For jar support, compile the program normally:
CRTBNDRPG WORDTABLE DBGVIEW(*LIST)
Or, if you have InfoZip available and prefer it (as I do), you can compile WORDTABLE with the USE_INFOZIP compiler condition set so that it uses InfoZip instead. For example:
CRTBNDRPG WORDTABLE DBGVIEW(*LIST) DEFINE(USE_INFOZIP)
QShell Messages on the Screen
Q: When I run your sample program from "How to Create a Word Document in RPG," it displays a bunch of unzip/zip messages and asks the user to press a key. Is there a way to avoid that? If an error occurs in QShell, how would I know?
A: QShell has environment variables named QIBM_QSH_CMD_OUTPUT (controls where QShell messages go) and QIBM_QSH_CMD_ESCAPE_MSG (controls how errors are reported to your program). In the code for this article (WORDTABLE), I set these variables with the following RPG code:
     D putenv          PR            10i 0 ExtProc('putenv')
     D   nameval                       *   value options(*string)
       .
       .
       putenv('QIBM_QSH_CMD_OUTPUT=FILEAPPEND=/tmp/wordtable.log');
       putenv('QIBM_QSH_CMD_ESCAPE_MSG=Y');
This means that any output from the zip/unzip/jar tools is added to a log file named /tmp/wordtable.log in the IFS. If any errors occur during the QShell command, an *ESCAPE message (a "halt") is sent to the RPG program. If you run the RPG program and get a QSHxxxx error, simply look at the /tmp/wordtable.log file to see the error messages. This way, none of the informational messages are printed on the user's display. You could add the preceding code to the WORDREPL example from my previous article as well if you want to disable the QShell output in that sample program.
Thanks for all the questions!