While participating in the NEUGC conference last week, I managed to pop in to George Farr's session that covered the new features in RPG in version 7.1. In this article, I provide an overview of these features and why you might want to use them.
The Server-Side Development Tools Renamed
The licensed program offering that provided the RPG, Cobol, C, and C++ compilers, as well as the server-side Application Development Toolset, was previously known as WebSphere Development Studio for IBM i (WDS). The name has changed with 7.1, and it will now be known as Rational Development Studio for IBM i (RDS). This change was presumably made because the software is no longer under IBM's WebSphere brand; it is now developed as part of Rational.
More interesting (and perhaps controversial?) is that RDS has stabilized the ADTS part of the offering, which includes PDM, SEU, SDA, and RLU. This means that there will be no further enhancements to these products.
Here's an overview of RDS:
| Product | Option | Description | Note |
|---|---|---|---|
| Application Development Toolset (ADTS) | 5770-WDS, opt 1 | PDM, SEU, SDA, RLU, DFU, CGU, ISDB, FCMU | Stabilized |
| Heritage Compilers | 5770-WDS, opt 2 | OPM, Sys/36 and Sys/38 compilers for RPG and Cobol | Stabilized |
| ILE Compilers | 5770-WDS, opt 3 | ILE RPG, Cobol, C, C++, and IXLC for C/C++ | |
Here's what's really interesting: SEU's syntax checking has been frozen at the 6.1 level! That means that SEU will not recognize any of the 7.1 enhancements and will detect these new features as errors.
What fascinates me about this "6.1 freeze" is that IBM actually had to enhance SEU to accommodate it. You see, SEU calls routines in the compiler to do its syntax checking (SEU itself doesn't do the checking). So IBM had to add functionality that tells the compiler an old product is making the request, so the compiler can hold the syntax checking at the 6.1 level.
This is part of IBM's effort to make you use its newer tool set for editing your code. If you want syntax checking on the new features, you must run Rational Developer for Power Systems Software—RPG and Cobol Development Tools (that's the new tool that has replaced RDi and WDSC). Unfortunately the name is a bit ungainly, so most people are abbreviating "Rational Developer for Power Systems Software—RPG and Cobol Development Tools" as simply "RDP."
So if you're still using PDM and SEU, IBM is trying to push you to RDP. I can't say that I'm happy about this!
Implicit Unicode Conversion for Parameters
RPG has supported Unicode via its UCS-2 data type (letter C) for quite a while now. In 7.1, IBM sweetened the deal a bit by doing implicit conversion when you pass parameters using prototypes.
For example:
Imagine a subprocedure like the following one:
P JoinAddress B
D JoinAddress PI 120c
D Name 30c const
D Addr1 30c const
D Addr2 30c const
D Addr3 30c const
Each of the parameters to this procedure is defined as 30 characters long with data type C, which means they are Unicode fields. Nothing new about that; we've had that support for a long time now. However, prior to 7.1, if you wanted to pass an EBCDIC string as input, you had to do this:
Output = JoinAddress( %ucs2(myName)
: %ucs2(myAddr1)
: %ucs2(myAddr2)
: %ucs2(myAddr3) );
The %UCS2 BIF was required to convert the data to UCS-2 before it was passed to the procedure. With 7.1, that's no longer necessary; RPG automatically does the conversion for you:
Output = JoinAddress( myName
: myAddr1
: myAddr2
: myAddr3 );
The same is true with graphic fields as well as alphanumeric. They are automatically converted from the source type to the destination type without you needing to use the extra BIFs to do the conversion.
How useful is this? Well, that's a tricky question. My observation is that very few RPG programmers are using Unicode right now, so I don't expect this to see a lot of use at first. However, I personally think that this is an important step for RPG's future. Unicode means that you can support any character used by any of the languages in the world! No worries about whether a given character is allowed in a given CCSID. Programs written to use Unicode will work flawlessly in Europe, Asia, Africa, etc., without changes. As we move more into global applications, this is absolutely crucial.
Thanks to this auto-conversion, we can write our subprocedures so that they take Unicode parameters. Then it doesn't matter whether the caller wants to pass EBCDIC (alphanumeric), Graphic, or Unicode; it'll work with any of them. Makes sense to me.
Pass Return Values as Parameters
One of the more frustrating problems with RPG's subprocedures has always been the performance of the return values. Long character strings used as a return value from a subprocedure are really, really slow in RPG, and so we work around the issue by passing the output data as a parameter. Here's an example:
title = center(input);
P center B
D center PI 65535a varying
D text 65535a varying const
.. logic to center a string goes here ..
P E
The return value is large, 64KB of data. And in 6.1, this can be much larger . . . it can be megabytes long if you want. The problem is, it performs poorly to return that much data. The ILE implementation of return values does not handle large amounts of data efficiently. It's not a problem when your return value is only a few hundred bytes, but when it gets into the thousands or millions of bytes, this performance becomes a real difficulty.
We'd typically work around the problem by coding more like this:
callp center(input: output);
title = output;
P center B
D center PI
D input 65535a varying const
D output 65535a varying
.. logic to center a string goes here ..
P E
Although this improves the performance, it makes the routine much less flexible, since it can no longer be used in an expression. This is the problem the 7.1 feature aims to solve. In 7.1, IBM added the RTNPARM keyword. With this keyword, you can still use your procedure in an expression as you normally would, and RPG converts the return value to a parameter under the covers for performance's sake.
title = center(input);
P center B
D center PI 65535a varying
D RTNPARM
D text 65535a varying const
.. logic to center a string goes here ..
P E
Since this changes the way parameters are passed under the covers, it's really only useful when your subprocedure is called from another RPG routine. It'll be tricky to call a RTNPARM routine from the other ILE languages, so you'll need to be careful about when you use RTNPARM and when you don't.
But it does provide a welcome performance boost!
Prototypes Are No Longer Required Within the Module
One of the more frustrating things about subprocedures in RPG has been the requirement that every PI match a PR. This meant you always had to code an identical PI and PR, even though they said the same thing. In 7.1, the PR is optional when the procedure is called from within the same module. For example:
X = MyProc( Y );
.
.
P MyProc B
D MyProc PI 10i 0
D parm1 12a const
.. do something ..
P E
RPG will allow MyProc() to be called without an explicit prototype, because it can find the PI in the same module. This also applies to PR/PI used as a 21st century replacement for *ENTRY PLIST.
Prior to V3R2, I had to code *ENTRY PLIST like this:
C *ENTRY PLIST
C PARM Parm1 10
C PARM Parm2 9 0
This *ENTRY PLIST syntax was clunky. In V3R2 we got support for prototypes and could change it to use a PR/PI. This support became much more popular (but wasn't new) in V5R1 when free-format RPG arrived, since *ENTRY PLIST didn't work in free format. With the PR/PI method, it would look like this:
D MYPGM PR ExtPgm('MYPGM')
D Parm1 10a
D Parm2 9p 0
D MYPGM PI
D Parm1 10a
D Parm2 9p 0
This would be coded at the start of my program (before any P-specs) to indicate that it worked like *ENTRY PLIST and received parameters for my program rather than a subprocedure. It does exactly the same thing as *ENTRY PLIST but works with a bunch of keywords, allowing me to do a bit more with it. Plus, since it's on the D-spec, it doesn't force me to use fixed-format code.
But I found it cumbersome to use because I had to code the whole thing twice each time. Both a PR and PI, with identical parameters!
Indeed, some of the other RPG programmers I talked to told me that they always used the fixed-format syntax (*ENTRY PLIST) because they hated coding it twice . . . and that was the only reason they used fixed format!
So with 7.1 that restriction is gone. You can now do it only once, like this:
D MYPGM PI ExtPgm('MYPGM')
D Parm1 10a
D Parm2 9p 0
This seems like a small thing, but it'll save me a great deal of time and hassle, because I have to code this sort of thing so often. It will definitely be a godsend! Though, I don't really understand why I need the ExtPgm. Isn't that a little weird?
Tell Me the Maximum Size of a VARYING Field
We have lots of BIFs, including %SIZE and %LEN, that tell you the size of a character string. %SIZE tells you how many bytes of memory something occupies, while %LEN tells you how many characters long it is.
But what do you do if you have a VARYING field and you want to know the maximum size that can be stored in the field?
In 6.1 or earlier I'd have to code this:
D MyAlpha s 100a varying
D MyUnicode s 100c varying
/free
maxlen = %size(myAlpha) - 2;
... or for Unicode ...
maxlen = %div(%size(myUnicode) - 2 : 2);
The %size() BIF tells me the size of the field in bytes of storage. This includes the extra two bytes that RPG uses to manage the length of the variable under the covers. So to calculate the maximum length—the most data that can fit in a field—I have to subtract two from the allocated size. Or, for a Unicode field, I have to subtract two, then divide by two because each character takes up two bytes of memory.
Even worse, starting with 6.1, the bytes that RPG uses under the covers to manage the length might sometimes be two bytes and sometimes be four bytes. Ugh!
I realize I'm not the average programmer, but this is something that I personally have to cope with every day, so I'm very glad to see it solved in 7.1. In 7.1, I can do this instead:
maxlen = %len(myAlpha: *MAX);
-or-
maxlen = %len(myUnicode: *MAX);
Much nicer.
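As a rough sketch of where this helps (the field names and sizes here are my own, made up for illustration), *MAX makes it easy to check whether an append will fit before you do it, since RPG truncates on assignment without complaint:

```rpgle
     D buf             S            100a   varying
     D piece           S             20a   varying inz('some text')
      /free
       // Append only when the result still fits within buf's declared
       // maximum; otherwise the assignment would silently truncate.
       if %len(buf) + %len(piece) <= %len(buf: *MAX);
          buf += piece;
       endif;
       *inlr = *on;
      /end-free
```

The same test works unchanged if buf is later redefined with a different length or as a Unicode field, which is the whole point of not hard-coding the arithmetic.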
Eliminate Hard-Coded Parameter Numbers
If you've ever used *NOPASS parameters in RPG (and you should be using them, they're extremely useful), I'm sure you've found yourself writing code like this:
P MyProc B
D MyProc PI
D foo 1a const
D bar 10a const options(*nopass)
D blug 9p 2 const options(*nopass)
D myBar s like(bar) inz('DFTVAL')
D myBlug s like(blug) inz(-1)
/free
if %parms >= 2;
myBar = bar;
endif;
if %parms >= 3;
myBlug = blug;
endif;
You always have to check to see if a *NOPASS parameter is passed. And the way you do that is by checking to see if the parameter count (in %PARMS) is high enough to include each parameter.
However, these numbers are typically hard-coded. If you add more parameters, you have to make sure you adjust the parameter numbers accordingly. It can get ugly, especially if you need to check for optional parameters later in the routine.
With 7.1, you can check these parameters by name rather than number. IBM provides a new BIF named %PARMNUM that takes the parameter name as input and returns the number as output. So you can now code this:
if %parms >= %PARMNUM(bar);
myBar = bar;
endif;
if %parms >= %PARMNUM(blug);
myBlug = blug;
endif;
This is another thing that may seem small, but I write code like this every day, and this makes my code much more readable. I'm happy about this one!
The %PARMNUM BIF might also be used with the Operational Descriptor APIs (CEEGSI or CEEDOD) to indicate which parameter you'd like to get the descriptor for. I do that much more rarely, but when I do, I'll be happy to eliminate the hard-coded number.
New Scan/Replace BIF (%SCANRPL)
I've participated in several forum discussions about how to find and replace a character string in RPG. At first people want to use the %REPLACE BIF but can't figure out how to find and replace a value with %REPLACE. The typical sample code looks like this:
X = %scan(Needle: Haystack);
dow X > 0;
Haystack = %replace(Replacement: Haystack: X: %len(Needle));
X = %scan(Needle: Haystack);
enddo;
Sure, this works, more or less. But it's not as simple as RPG programmers want it to be. The biggest problem is that the %REPLACE BIF doesn't replace one string with another. Instead, it replaces a given substring (identified by start position and length) with a new value. This means that if you want to find and replace you have to use %REPLACE in conjunction with %SCAN and call them in a loop as I did above.
Bugs with this type of loop are common. For example, the above code would get stuck in the loop forever if Replacement happened to contain the characters in Needle.
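For what it's worth, here's a minimal sketch of one way to code around that hang: resume each scan just past the text you inserted instead of starting over at position 1. The field names, lengths, and sample values are assumptions of mine for illustration.

```rpgle
     D Haystack        S            200a   varying
     D Needle          S             20a   varying
     D Replacement     S             20a   varying
     D pos             S             10i 0
      /free
       Haystack = 'a-b-c';
       Needle = '-';
       Replacement = '--';   // contains Needle, so the naive loop hangs
       pos = %scan(Needle: Haystack);
       dow pos > 0;
          Haystack = %replace(Replacement: Haystack: pos: %len(Needle));
          // Resume the search just past the text that was inserted
          pos += %len(Replacement);
          if pos > %len(Haystack);
             leave;
          endif;
          pos = %scan(Needle: Haystack: pos);
       enddo;
       // Haystack is now 'a--b--c'
       *inlr = *on;
      /end-free
```

The extra check before the second %SCAN is there so the start position never runs past the end of Haystack.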
Sure, you can code around this, but it gets tricky. And it's cumbersome to do this every time you want to find and replace a string. Why can't life be easy, you ask? Well it can in 7.1, if you use the %SCANRPL BIF.
Haystack = %scanrpl( Needle: Replacement: Haystack );
This is especially useful if you find that RPG's concatenation can sometimes be ugly. For example, you might have something like this:
msg = 'Customer ' + %char(CustNo) + ' not found!';
If you find that cumbersome, you might use %SCANRPL instead:
format = 'Customer &1 not found!';
msg = %scanrpl('&1': %char(CustNo): format);
%SCANRPL will replace all occurrences. So the following would also work:
format = 'Customer &1 not found! Where is customer &1?';
msg = %scanrpl('&1': %char(CustNo): format);
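If you need more than one placeholder, one approach is to nest the calls. This is only a sketch; OrderNo, the field sizes, and the message text are made up for the example:

```rpgle
     D format          S             80a   varying
     D msg             S             80a   varying
     D CustNo          S              5p 0 inz(1234)
     D OrderNo         S              7p 0 inz(98765)
      /free
       format = 'Order &2 for customer &1 not found!';
       // The inner call fills in &1; the outer call then fills in &2
       msg = %scanrpl('&2': %char(OrderNo):
                      %scanrpl('&1': %char(CustNo): format));
       // msg is now 'Order 98765 for customer 1234 not found!'
       *inlr = *on;
      /end-free
```

The order of the calls matters if a replacement value could itself contain a placeholder, so treat this as a convenience rather than a full message-formatting facility.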
Sorting and Searching Data Structure Arrays
With V5R2, we gained the ability to put the DIM keyword on a qualified data structure, allowing the whole data structure to be treated as an array. The result is similar to a Multiple Occurrence Data Structure (MODS) but more elegant. It also allows you to nest arrays within other arrays, making it possible to have multiple dimensions.
D Items ds qualified
D dim(999)
D ItemNo 5p 0
D Desc 25a
D Unit 1a
D Qty 5p 0
D Price 5p 2
In this example, I've put together an array that lets me store all of the line items on one of our invoices. Our shop allows up to 999 items on an invoice, and each line item has an item number, description, unit of measure, quantity, and price.
This support worked nicely in V5R2 but didn't let you use RPG's %LOOKUP BIF or the LOOKUP and SORTA opcodes. This led to a bit of frustration, because APIs had to be used to do the sorting and searching.
Because of this lack of sorting/searching support, many people used OVERLAY arrays (arrays where subfields overlay the main array field in a data structure), which worked OK but weren't as elegant in syntax and didn't allow nesting.
In 7.1, IBM fixed this limitation by enhancing SORTA and %LOOKUP. All you have to do is specify (*) on the array to indicate which array to sort or search.
// sort by item number
SORTA Items(*).ItemNo;
// sort by description
SORTA Items(*).Desc;
If you have a DS array with multiple dimensions, you specify (*) on the array that you're sorting, and every other array in the name must specify a specific element. For example:
D Sales_t ds qualified
D Month 2p 0
D MonthName 9a
D Dollars 9p 2
D Territory ds qualified
D dim(26)
D TerrId 1a
D Name 20a
D Sales likeds(Sales_t) dim(12)
This DS array has 26 elements, each one representing a sales territory. Each territory has a territory ID code, a descriptive name, and a 12-month summary of its sales. Sales is implemented as an array within an array (or a second dimension, if you will).
To sort my array by TerrId, I'd do the following:
SORTA Territory(*).TerrId;
To sort the first territory's sales by month, I'd do the following:
SORTA Territory(1).Sales(*).Month;
Or if I wanted to sort it by the dollars sold, I might do this instead:
SORTA Territory(1).Sales(*).Dollars;
If I wanted to sort every territory's months by their dollar amounts, I'd have to loop through the territories like this:
for x = 1 to %elem(Territory);
SORTA Territory(x).Sales(*).Dollars;
endfor;
Likewise, if I wanted to search for the sales for territory R, I could use %LOOKUP to find it:
X = %lookup( 'R': Territory(*).TerrId );
I'm glad to see that data structure arrays are now more powerful than overlay arrays, because they are much nicer to work with in other ways. They are also a lot easier to use as parameters to subprocedures.
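To round out the %LOOKUP example, here's a sketch of what I'd do with the result. It repeats the Territory definition from above so it stands alone; x and terrName are names I've made up. %LOOKUP returns zero when nothing matches, so check the index before using it:

```rpgle
     D Sales_t         DS                  qualified
     D   Month                        2p 0
     D   MonthName                    9a
     D   Dollars                      9p 2
     D Territory       DS                  qualified dim(26)
     D   TerrId                       1a
     D   Name                        20a
     D   Sales                             likeds(Sales_t) dim(12)
     D x               S             10i 0
     D terrName        S             20a
      /free
       x = %lookup('R': Territory(*).TerrId);
       if x > 0;
          // Found: x is the index of the matching territory
          terrName = Territory(x).Name;
       else;
          // Not found: never use x = 0 as an array index
          terrName = *blanks;
       endif;
       *inlr = *on;
      /end-free
```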
SORTA Can Now Do Both Ascending and Descending
The SORTA opcode has been well established as the way that RPGers sort arrays. It's simple and easy to use and runs faster than any of the APIs. However, you always had to hard-code whether the sort was ASCENDing or DESCENDing on the D-spec for the array.
For example, we had to do this:
D Sales s 9p 2 dim(12) ASCEND
/free
sorta Sales;
If you wanted to be able to sort the data both ways in the same program (i.e., both ascending and descending), you had to do a bit of fooling around. For example, you could use a pointer like this:
D Sales s 9p 2 dim(12) ASCEND
D Sales2 s like(Sales)
D dim(%elem(Sales)) DESCEND
D based(p_Sales2)
/free
p_Sales2 = %addr(Sales);
if (want_ascending);
sorta Sales;
else;
sorta Sales2;
endif;
With 7.1, IBM added (A) and (D) operation extenders to the SORTA opcode. That means you can specify ascending or descending on the opcode rather than the array, which also means you can switch. For example:
D Sales s 9p 2 dim(12)
/free
if (want_ascending);
sorta(a) Sales;
else;
sorta(d) Sales;
endif;
Support for Longer Database Field Names
For many years now, our integrated database (DB2 for i) has supported long field names. While we have traditionally limited our field names to 10 characters long, fields defined with the ALIAS keyword in DDS can be much longer. Likewise, files created with SQL can have much longer names.
Indeed, I spoke with some people who recently switched to IBM i. Their first complaint was, "Why do you always use these horribly abbreviated column names? Does your database really limit you to 10 characters?" Of course, I quickly informed them that I was advanced, and that many shops limited their field names to 6 characters! Heh, heh, heh . . .
But, they had a point. Every database software package in the world allows for long, meaningful field names like CUSTOMER_NUMBER or SHIP_TO_ADDRESS instead of shortened names like CUSTNO and STADDR. And since DB2 for i supported these long names as well, I've always felt like RPG was holding us back.
When I spoke to Barbara Morris from IBM a year or two ago, she said there wasn't much they could do. I-specs limited the length of a field name, and even for externally defined files, the RPG compiler still generates I-specs under the covers.
Fortunately, I-specs aren't the only way to read a file anymore! You can now read directly into a data structure. And thankfully, in 7.1, IBM lets you generate the structures with long names for the fields. Simply code the ALIAS keyword.
D CustRec E DS QUALIFIED
D ALIAS
D EXTNAME(CUSTFILE)
/free
chain 1234 CUSTFILE CustRec;
if %found;
myCustNo = CustRec.CUSTOMER_NUMBER;
endif;
XML-INTO Enhancements
Two new features for XML-INTO are the countprefix and datasubf options.
This support was actually provided as a PTF for 6.1 about a year ago. The same support is now shipped with 7.1, so it's considered a new 7.1 feature.
But, since it's not really new, I'm not going to describe it here. To learn more about this feature, check out the description on IBM's RPG Cafe.
Teraspace Storage Model
Both RPG and Cobol can now be compiled with STGMDL(*TERASPACE) to let them use teraspace as their default storage model. They can also be STGMDL(*INHERIT), which means that your program will use the same storage model as your caller. The default is STGMDL(*SNGLVL), which means that RPG and/or Cobol uses the standard single-level storage model as it always has.
The storage model controls how RPG allocates memory. Previously, all memory was allocated in the single-level storage model, which means that every memory allocation is limited to 16MB. With the teraspace model, allocations with the %ALLOC or %REALLOC BIF (for example) will now be able to allocate multiple terabytes of storage.
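Here's a minimal sketch of what that might look like, assuming the program is compiled with DFTACTGRP(*NO) and the STGMDL keyword is coded on the H-spec rather than on the compile command. The variable names and the 100MB size are just for illustration:

```rpgle
     H DFTACTGRP(*NO) STGMDL(*TERASPACE)
     D p               S               *
     D nbytes          S             10u 0 inz(100000000)
      /free
       // About 100MB, well over the 16MB single-level limit
       p = %alloc(nbytes);
       // ... work with the storage through based variables ...
       dealloc p;
       *inlr = *on;
      /end-free
```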
Theoretically, it also means that the RPG compiler itself uses teraspace for all the variables that it reserves. But, seeing as how the RPG compiler doesn't let you declare variables longer than 16MB, it's hard to see how this storage model is useful.
Still, ILE C has had this functionality since ILE was introduced in V2R3. It's good to see that RPG and Cobol have gained this ability as well. That leaves CL as the only ILE language that can't use the teraspace storage model.
It might be worth noting that we've been able to use teraspace in RPG since V4R4. However, that support only let you allocate memory with the teraspace APIs and manipulate the pointers in RPG. The new support switches everything in your program to teraspace automatically without needing to call special APIs.
It's hard for me to imagine very many RPGers using this support. The more exciting aspect of this for me is the doors that it might open for the RPG language in the future. For example, in a future release, maybe we won't have any limit on the size of variables or arrays? Maybe RPG will add an "expanding" data type where you don't have to define the length of a string, and RPG will automatically extend it as needed? These dreams of the future are far more interesting to me than what this storage model support provides today.
Rational Open Access: RPG Edition
The biggest RPG feature of 7.1 is RPG Open Access. Open Access (previously referred to as "Open I/O" in early discussions) makes it possible for programmers to write "handlers" that take the place of file I/O when you use RPG's native file opcodes.
The way George Farr explained it, under the covers, anytime you read or write from a file, RPG actually calls a routine in the OS to handle that request. For example, if you CHAIN to a record in a PF, RPG calls a routine in the database manager that retrieves a record by key. Similarly, when you use EXFMT to display a screen, it calls a routine in the display file management portion of the OS. So while we tend to think of these operations as "reading a file," the RPG runtime is really just calling a routine.
What IBM has done is open those routines up to you. You can now have it call a routine of your choosing rather than one provided by the OS. That way, you have full control over what happens when the program tries to read a file. Will it actually read a file? Or will your code simply calculate the value returned for the fields in the record? It's up to you.
People are excited about this new tool because it makes it possible for existing RPG code to use opcodes like EXFMT, READ, WRITE, etc., against a display file—but they can provide the routines that handle the display I/O. So the routine might decide to display a web page instead of a display file. Or it might decide to communicate with a Visual C++ program running on Windows, and that Visual C++ program might bring up a GUI window.
Third-party vendors are already providing prewritten handlers that RPG Open Access can call. So if you want your RPG programs to output to the web instead of outputting to a 5250 terminal, you can buy a "web" handler from a vendor, pop it in, and your program now goes to the web!
If, in the future, you want your output device to be an iPhone or iPad or Droid, it's just a matter of buying a new handler. The only change required to your RPG program is the F-spec, where you'll need to code the HANDLER keyword to tell it where to call the handler routine.
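As a sketch of how small that change is, here's roughly what the F-spec might look like. MYDSPF, SCREEN1, and the MYSRVPGM(WEBHANDLER) handler are hypothetical names; a real handler would come from your vendor or from your own service program:

```rpgle
     FMYDSPF    CF   E             WORKSTN HANDLER('MYSRVPGM(WEBHANDLER)')
      /free
       // Application logic is unchanged: the EXFMT request is now
       // routed to the handler instead of display file management.
       exfmt SCREEN1;
       *inlr = *on;
      /end-free
```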
Unfortunately, I don't really have the space to go in-depth about RPG Open Access. That's something that could easily fill an article by itself. Or several articles! But if you'd like to see a more in-depth technical description, I suggest that you check out the technical documentation on the RPG Cafe.
RPG Open Access is not included with the RPG compiler, however. Instead, you have to purchase it separately from IBM. So, unless you plan to write your own handlers, you're going to need approval for two different costs: Open Access itself, plus whatever the vendor is charging for their handler.
On the plus side, it's available for both 6.1 and 7.1, so you don't even have to wait till 7.1 before you can try it out.
My opinion is that Open Access is overhyped. It has been represented as the greatest thing ever and the salvation of the RPG language, and in my opinion that's blown way out of proportion. All it does is enable a new way of calling subprocedures in a service program. Instead of calling them directly, you do file opcodes, and the file opcodes call the procedures.
Plus, the existing SPECIAL file support, although limited, never got a lot of adoption in the RPG community. So why come out with a new tool that's almost identical to a tool that's hardly used? Is there really that much demand for it?
Furthermore, from the RPG program's perspective, it's still reading/writing a display file. Even though you may have another device on the other end, RPG doesn't know that, and so it can't take advantage of it. It can't behave like a proper web application, because it's trying to control the flow and act in a stateful manner. It can't behave like a proper GUI program, because it can't take action based on mouse events or what keystrokes a user typed into which custom controls. All it knows is how to read/write data to a display file. And we basically already had the same thing with screen scrapers, didn't we? I realize that Open Access is intercepting the program logic at a different level than a screen scraper would—but other than that, isn't it still doing the same thing? Transforming a DDS-defined 5250 screen into a GUI screen, while tricking the RPG program into thinking it's still 5250?
What do you think? Leave your comments below.