There is thorough checking done by the reporting system inside the utility
"lis2ps" so this problem will be spotted and reported if it happens. I
wrote a utility "pstolp" (the opposite of "lptops") that strips out the
text from a PostScript file to turn it back into plain text and then this
file gets compared to the original text file converted by "lptops". It
will report any differences, if it finds any, and will explain the problem.
"pstolp" is written in gawk and it is slightly gawk version dependent. This
is explained in the header. You can view this script below.
pstolp
(At the time I developed Spectre, "lptops" was the only utility I could find that let you specify individual margins as well as adjust the spacing between lines. Since then, the "a2pdf" utility (not written by me) has these options too. By default, Spectre uses a line spacing ratio of 1.1 instead of the more usual 1.2 because Courier font characters are not as tall as those of most other fonts, so a line spacing of 1.1 looks better and allows more lines to be fitted on a page. "lptops" works very well and there is every indication that it will continue to be available, but in case it should ever disappear, I will convert over to "a2pdf", so do not be concerned about the reliance on "lptops".)
The job of writing these macros used to be done by a senior programmer
but now that _ptlibref_ has been added as a global macro variable,
the Spectre administrator should set up these two macros to make
sure _ptlibref_ has not been forgotten. If it isn't set up then
many of the Spectre scripts that call sas will fail. You might have low
level macros to do this or high level macros. I use low level macros for
this, so one has to be written for every study increment. You can
see two examples of them below.
%allocr
%allocw
cp -p $(myfiles *.titles) /old/inc /new/inc
Derived dataset program names
If the reporting system is being used for the build of the derived datasets then the numbering convention can go out of control. To recap, the derived datasets will be built starting with a cut-down version of an acct dataset having one observation per subject. What this is called will vary from site to site. Suppose this cut-down version is named acct0 and it is a stats dataset; then the recommended program name is s10_acct0.sas . The number "10" acts as a sort key that Spectre uses to run the programs in the specified order.
The other derived dataset build programs will use this acct0 dataset as part of the build process and it is intended that the stats programs start with s11_ and the derived dataset programs start with d11_ . The rest of the name should match the name of the dataset being built, so the stats adverse events program building the dataset adv will have the name s11_adv.sas . After the derived datasets have been built it is normal to create a full acct dataset. This will be based on the cut-down acct0 dataset but with information from the other derived datasets added. The recommended name for this is s20_acct.sas assuming the final dataset name is acct.
Sometimes the numbering has to go above "11" if a dataset build program needs the results from another dataset build program. Suppose the adverse events dataset required dosing values to be merged in so that you knew what dose a patient was on when they had an AE. If the dose build program was named s11_dose.sas then the adverse event build program would have to use a number higher than "11" such as s12_adv.sas . With programmers working independently, this numbering of programs can sometimes get out of control. Also, if they write the specifications for their own datasets, they might decide they want a field from the final acct dataset included, so they will use a number higher than that used for s20_acct.sas .
The situation can get ridiculous but confusion can be avoided if two rules are followed:
- Only efficacy stats datasets are allowed to contain fields that are in the "acct" dataset that are not in the "acct0" dataset and only they can have a number greater than the s20_acct.sas program.
- Programmers must use the lowest number allowed for their derived dataset build programs.
To track this you can use the "derorder" script. This will identify all those programs that match the pattern for a derived dataset build program. This is the script that will be called when running these programs to identify the list of programs to be run and in what order. Just type in the command from the programs directory.
derorder
Another thing you might notice is that some programmers have old versions of their programs such as d11_adv_old.sas . This is a mistake as this program matches the pattern and it too will be run. Old versions should have names like d11_adv.old so that they do not match the pattern.
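To see why d11_adv_old.sas gets picked up while d11_adv.old does not, here is a minimal shell sketch. It is not the "derorder" script itself. The function name list_derived and the glob pattern (a letter "s" or "d", two digits, an underscore, then anything ending in ".sas") are assumptions made for illustration; the real pattern and the real ordering of ties are defined inside "derorder".

```shell
# Sketch only: the pattern below is an assumption, not taken from "derorder".
# Lists files named like s10_acct0.sas or d11_adv.sas in the given directory,
# ordered by the two-digit number and then by name.
list_derived() {
    dir=${1:-.}
    for f in "$dir"/[sd][0-9][0-9]_*.sas; do
        [ -e "$f" ] || continue     # the glob matched nothing
        basename "$f"
    done | LC_ALL=C sort -k1.2,1.3n  # sort key is characters 2-3 (the number)
}
```

Running list_derived from a programs directory holding both d11_adv.sas and d11_adv_old.sas would list both of them, which is exactly the problem described above. A file named d11_adv.old would not be listed.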
Report program names
Programmers using the reporting system tend to work at a fast pace. Once they have a reporting program they will often make copies of it to use for different populations (like "safety", "full analysis set" and "per-protocol") and different age groups. If their naming convention is wrong for the first program then this error will get propagated to the copies. If they can spot this early then some time will be saved. The mistake is most easily seen in their titles members. You can use the "myfiles" command for a specific user, so to list the titles members belonging to "rrash" you could use this command:
myfiles -u rrash *.titles
It would be a good idea to check on this from time to time and inform the programmers if their file names should be changed. It should not be a lot of work for them as they can use "grename" to rename their files, "mybadtitlesprogs" to then list where the program name is different to the titles member file name (using the "-f" option will fix it for them) and if their headers follow the recommended convention then "mybadprognames" can both identify and fix where the program name does not match the file name.
If the reporting system is being used to create the derived datasets
and you want to run all these programs plus the reporting programs then
you can do this using the command fullrunsuite.
fullrunsuite
If you only want to run the derived dataset build programs or the report
programs then first you have to generate the scripts to do this using the
command makerun.
makerun
Then you can run whatever set of programs you wish to run using the
generated scripts "runderived" or "runreports". Whichever you choose to
run, you should redirect standard error to a file. To run the derived dataset
programs use this command from the programs directory:
./runderived 2>runderived.err
To run the reports, use this command from the programs directory:
./runreports 2>runreports.err
You should check the "err" files at the end for errors. If you want to see only those programs that encountered some sort of problem then use the script "runproblems" to display these. Note that when an error is encountered for the derived dataset build, the programs following it are aborted. This is not the case for the report programs as they have no dependency on each other.
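As a quick first look before reading the "err" files in full, a small shell sketch like the one below can tell you which of them actually contain anything. It does not replace "runproblems". The function name report_stderr_files is hypothetical, and a non-empty file does not necessarily mean a real error, only that something was written to standard error.

```shell
# Sketch only: report which of the named files are non-empty.
# Returns 1 if at least one file has content, 0 otherwise.
report_stderr_files() {
    found=0
    for f in "$@"; do
        if [ -s "$f" ]; then
            # $(( ... )) strips any padding that wc puts around the count
            echo "$f: $(( $(wc -l < "$f") )) line(s) written to standard error"
            found=1
        fi
    done
    return "$found"
}
```

From the programs directory you could call it as report_stderr_files runderived.err runreports.err after the runs have finished.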
Once you have made your donesect files, these have to be checked against
the original donelist.txt file to make sure you included all the entries
and have not repeated any. You do this using the script donesectchk.
It will tell you if it detects any problems.
donesectchk
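The kind of check described above (every entry included, none repeated) can be sketched in a few lines of shell. This is not the "donesectchk" script. The function name donesect_check is hypothetical, and the sketch assumes the master list and the section files are plain text with one entry per line, which may not match the real file layout.

```shell
# Sketch only: assumes one entry per line in every file.
# Usage: donesect_check MASTER_LIST SECTION_FILE...
donesect_check() {
    master=$1; shift
    tmp=$(mktemp -d) || return 2
    LC_ALL=C sort -u "$master" > "$tmp/master"
    cat "$@" | LC_ALL=C sort > "$tmp/all"
    # Entries that appear more than once across the section files
    uniq -d "$tmp/all" | sed 's/^/repeated: /'
    LC_ALL=C sort -u "$tmp/all" > "$tmp/uniq"
    # Entries in the master list that no section file contains
    LC_ALL=C comm -23 "$tmp/master" "$tmp/uniq" | sed 's/^/missing: /'
    # Entries in a section file that the master list does not contain
    LC_ALL=C comm -13 "$tmp/master" "$tmp/uniq" | sed 's/^/not in master list: /'
    rm -rf "$tmp"
}
```

If the function prints nothing then, under the stated assumption, the section files cover the master list exactly once each.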
Here is how you would use this script to create PostScript files and PDFs for two donesect files. "a4ps2pdf" is used to create the PDF files. If your paper size was US Letter then "usps2pdf" would be used instead.
bigps -n 1 > BIG1.ps
a4ps2pdf BIG1.ps
bigps -n 2 > BIG2.ps
a4ps2pdf BIG2.ps
When "a4ps2pdf" (or "usps2pdf") creates a PDF file it gives it the same name as the original file but with the file extension ".pdf" so the two PDFs created in the above case would be "BIG1.pdf" and "BIG2.pdf".
printpdfbookmarks BIG1.pdf > bookmarks1.txt
titlesvsbkmarks BIG*.pdf
mkdir status060217
cp -p $(donefiles) status060217
cp -p donelist.txt status060217
cp -p runreports.err status060217
cp -p runreports.chk status060217
cp -p ALL.TITLES status060217
Now that you have the backup, it is important to know if any of the
numbers in the reports have changed since the previous run. You should
use the script "ddiff" for this. It compares the contents of every file
(that matches a pattern you supply) in the current directory to
identically named files in another directory. Since there might be a date
(and time) change in one line of every output you can specify this to the
"-i" (ignore) option so that these lines are not compared. Suppose this
line with the changing date (and time) started with the word "Report" and
suppose the previous run was on 05 Feb 2006; then you could make a comparison
and store the results like this:
ddiff -i ^Report -d ../status060205 *.lis* > ddiff_060217_vs_060205.txt
Using the "ddiff" script should become part of your QC process.
You should think about including its use in your SOPs. As an administrator,
you will need to run "ddiff" in this way and be aware of any number differences
found in any of the output. Someone will need to know about these differences
if it involves numbers changing. You can learn more about "ddiff" from
its header.
ddiff
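To make the idea behind the "-i" option concrete, here is a minimal shell sketch of comparing files against identically named files in another directory while leaving out lines that match a pattern. It is not the "ddiff" script and does not reproduce its output. The function name compare_dirs and its messages are hypothetical, and the sketch only says whether a file differs, not where.

```shell
# Sketch only: not the real "ddiff".
# Usage: compare_dirs IGNORE_REGEX OTHER_DIR FILE...
compare_dirs() {
    ignore=$1; other=$2; shift 2
    tmp=$(mktemp -d) || return 2
    for f in "$@"; do
        if [ ! -f "$other/$f" ]; then
            echo "$f: not found in $other"
            continue
        fi
        # Drop the lines matching the ignore pattern from both copies
        grep -v -- "$ignore" "$f" > "$tmp/new"
        grep -v -- "$ignore" "$other/$f" > "$tmp/old"
        # Whatever is left must be identical
        cmp -s "$tmp/old" "$tmp/new" || echo "$f: differs"
    done
    rm -rf "$tmp"
}
```

A call along the lines of compare_dirs '^Report' ../status060205 *.lis* would then print one line for each output whose content, other than the date line, has changed since the earlier run.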
You will see some of the code below used in the data _null_ step. It
is writing out lines of PostScript code. Don't ask me to explain how PostScript
code works, because I do not have a clue, but what the following code does
is add four lines after the first line in the PostScript file. The first
two lines act as a warning to a PostScript reader to ignore the instruction
"pdfmark". Most printers are PostScript readers so this tells the printer
not to do anything if it encounters the "pdfmark" instruction (like print
it out). The last two lines add the bookmark. The text of the bookmark
should be in round brackets and should not be in quotes. &_figbkmark_
gets resolved into the bookmark text in this case. If you edit a PostScript
file to add these lines then do not put the lines in double quotes as in
the code below. These quotes are there to get the program syntax correct.
And if you edit a file then copy and paste what you see below within the
double quotes. Do not think in code terms and try to combine two lines
into one. Just copy it exactly as it is. Note that two sets of brackets
are curly brackets but round brackets are used to contain the bookmark
text. If you want to check that you have done it correctly then open up
the PostScript file in "gv" after you have edited it. It will soon tell
you if there is a problem, though you may not understand its explanation.
If all is well then the bookmark should be there when you convert the PostScript
file to a PDF.
%if %length(&_figbkmark_) %then %do;
  if _n_=1 then do;
    put _infile_;
    put "/pdfmark where";
    put "{pop} {userdict /pdfmark /cleartomark load put} ifelse";
    put "[ /Title (&_figbkmark_)";
    put " /OUT pdfmark";
  end;
  else put _infile_;
%end;
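If you would rather add the four lines to an existing PostScript file from the command line than edit it by hand, the same insertion can be sketched with awk. This is not part of Spectre. The function name add_bookmark is hypothetical, and the sketch assumes the bookmark text contains no round brackets or backslashes, since those would need escaping inside a PostScript string.

```shell
# Sketch only: copy standard input to standard output, adding the four
# pdfmark lines after the first line. The bookmark text is passed to awk
# through the environment so that awk does not reinterpret backslashes.
# Usage: add_bookmark "Bookmark text" < in.ps > out.ps
add_bookmark() {
    BKMARK=$1 awk '
        NR == 1 {
            print
            print "/pdfmark where"
            print "{pop} {userdict /pdfmark /cleartomark load put} ifelse"
            print "[ /Title (" ENVIRON["BKMARK"] ")"
            print " /OUT pdfmark"
            next
        }
        { print }
    '
}
```

As with a hand edit, it is worth opening the result in "gv" to confirm the file is still valid before converting it to a PDF.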