UNIX:STAT Statistics Program
Gary Perlman
perlman at giza.cis.ohio-state.edu
Tue Oct 30 16:47:50 AEST 1990
In article <1990Oct29.221237.25451 at pmsmam.uucp> wwm at pmsmam.UUCP (Bill Meahan) writes:
>In article <1990Oct25.223804.13113 at isc.rit.edu> wlw2286 at isc.rit.edu (Lance Ware) writes:
>>A few years back I had a Statistics program called UNIX:STAT, I believe,
>>but I have since changed machines, etc., and don't have it anymore. Does
>>anyone know where this program came from (I am pretty sure I FTP'd it)?
>>If anyone can help me locate it I'd appreciate it.
>>
>>Lance
I hope you didn't ftp it, because that would violate the license.
>
>I'd like to find it too. It was a very good package that I liked better
>than several of the expensive commercial packages around.
Well, it is certainly not expensive. You just have to pay for me to
mail you a tape or floppies and you can copy the programs to all your
machines.
|STAT (which runs on UNIX or DOS) is alive and well in Ohio.
The current release number is 5.4, dated March 1989, although
there have been several minor changes to programs since then.
Here is the standard blurb, followed by a list of change notes.
|STAT 5.4
DATA MANIPULATION & ANALYSIS PROGRAMS
FOR UNIX and MSDOS
|STAT is a set of over 20 data manipulation and analysis programs
developed by Gary Perlman at the University of California, San Diego and at
the Wang Institute of Graduate Studies. The programs are designed with the
UNIX philosophy that individual programs should be designed as tools that
do one task well and produce output suitable for input via pipes to other
programs. Interactive use is supported in the command line interpreter,
which also provides a programming language for complex analyses. Functions
built into many statistical packages (e.g., graphics and editing) are not
re-invented in |STAT, which delegates such responsibility to standard tools.
Typical usage involves a pipeline of transformations of data followed by
input to an analysis program, summarized schematically by:
INPUT DATA | TRANSFORM | ANALYSIS | OUTPUT RESULTS
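The schematic above can be illustrated outside of |STAT itself. The
following is only a minimal Python sketch of the same idea, with each stage
written as a small function that consumes the output of the previous one.
The function names (parse_fields, extract_column, summarize, run_pipeline)
and the sample data are hypothetical and are not part of |STAT; the sketch
does not reproduce the behavior or options of any |STAT program.

```python
import statistics
from typing import Dict, Iterable, Iterator, List


def parse_fields(lines: Iterable[str]) -> Iterator[List[str]]:
    """INPUT DATA: split free-format lines into whitespace-separated fields."""
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            yield fields


def extract_column(rows: Iterable[List[str]], index: int) -> Iterator[float]:
    """TRANSFORM: keep one column (0-based index) and convert it to a number."""
    for row in rows:
        yield float(row[index])


def summarize(values: Iterable[float]) -> Dict[str, float]:
    """ANALYSIS: compute a few simple summary statistics."""
    data = list(values)
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "sd": statistics.stdev(data),  # sample standard deviation
    }


def run_pipeline(lines: Iterable[str], index: int) -> Dict[str, float]:
    """Chain the stages in the order INPUT | TRANSFORM | ANALYSIS."""
    return summarize(extract_column(parse_fields(lines), index))


if __name__ == "__main__":
    # Hypothetical free-format data: a label followed by two numeric fields.
    sample = ["a 1 10", "b 2 20", "c 3 30", "", "d 4 40"]
    # OUTPUT RESULTS: one tab-separated name/value pair per line.
    for name, value in run_pipeline(sample, 2).items():
        print(f"{name}\t{value:g}")
```

Each stage only knows about the data passed to it, so stages can be swapped
or reordered in the same way that programs are rearranged in a shell pipeline.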
Data Manipulation Programs:
    abut        join data files beside each other
    colex       column extraction/formatting
    dm          conditional data extraction/transformation
    dsort       multiple key data sorting filter
    linex       line extraction
    maketrix    create matrix format file from free-format input
    perm        permute line order randomly, numerically, alphabetically
    probdist    probability distribution functions
    ranksort    convert data to ranks
    repeat      repeat strings or lines in files
    reverse     reverse lines, columns, or characters
    series      generate an additive series of numbers
    transpose   transpose matrix format input
    validata    verify data file consistency
Data Analysis Programs:
    anova       multi-factor analysis of variance
    calc        interactive algebraic modeling calculator
    contab      contingency tables and chi-square
    desc        descriptions, histograms, frequency tables
    dprime      signal detection d' and beta calculations
    features    tabulate features of items
    oneway      one-way anova/t-test with error-bar plots
    pair        paired data statistics, regression, scatterplots
    rankind     rank order analysis for independent conditions
    rankrel     rank order analysis for related conditions
    regress     multiple linear regression and correlation
    stats       simple summary statistics
    ts          time series analysis and plots
Package Features:
simple input formats (free format field oriented)
flexible data manipulation
several simple lineprinter plotting options
data validation (range and type checking)
consistent option conventions with online help
runs on any UNIX System (V6, V7, 2.8BSD, 4BSD, System V, etc.)
runs on MSDOS 2.0 and 3.0 with 96K (IBM, Wang, AT&T, Epson, etc.)
usually less than a few seconds per analysis
liberal copyright (but can't be distributed for gain)
Notes:
UNIX is a trademark of AT&T Bell Laboratories.
MSDOS is a trademark of Microsoft.
|STAT is NOT a product of any company or organization.
Distribution Conditions:
CAREFULLY READ THE FOLLOWING CONDITIONS. IF YOU DO NOT FIND THEM
ACCEPTABLE, YOU SHOULD NOT USE |STAT.
|STAT IS PROVIDED "AS IS" AND WITHOUT ANY WARRANTY EXPRESS OR IMPLIED. THE
USER ASSUMES ALL RISKS OF USING |STAT. THERE IS NO CLAIM OF THE
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. |STAT MAY NOT BE
SUITED TO YOUR NEEDS. |STAT MAY NOT RUN ON YOUR PARTICULAR HARDWARE OR
SOFTWARE CONFIGURATION. THE AVAILABILITY OF AND PROGRAMS IN |STAT MAY
CHANGE WITHOUT NOTICE. NEITHER MANUFACTURER NOR DISTRIBUTOR BEAR
RESPONSIBILITY FOR ANY MISHAP OR ECONOMIC LOSS RESULTING THEREFROM OF THE
USE OF |STAT EVEN IF THE PROGRAMS PROVE TO BE DEFECTIVE. |STAT IS NOT
INTENDED FOR CONSUMER USE.
CASUAL USE BY USERS NOT TRAINED IN STATISTICS, OR BY USERS NOT SUPERVISED
BY PERSONS TRAINED IN STATISTICS, MUST BE AVOIDED. USERS MUST BE TRAINED
AT THEIR OWN EXPENSE TO LEARN TO USE THE PROGRAMS. DATA ANALYSIS PROGRAMS
MAKE MANY ASSUMPTIONS ABOUT DATA, THESE ASSUMPTIONS AFFECT THE VALIDITY OF
CONCLUSIONS MADE BASED ON THE PROGRAMS. REFERENCES TO APPROPRIATE
STATISTICAL SOURCES ARE MADE IN THE |STAT HANDBOOK AND IN THE MANUAL
ENTRIES FOR SPECIFIC PROGRAMS. THE PROGRAMS HAVE NOT BEEN VALIDATED FOR
LARGE DATASETS, HIGHLY VARIABLE DATA, NOR VERY LARGE NUMBERS.
YOU MAY MAKE COPIES OF ANY TANGIBLE FORMS OF |STAT, PROVIDED THAT THERE IS
NO MATERIAL GAIN INVOLVED, AND PROVIDED THAT THE INFORMATION IN THIS NOTICE
ACCOMPANIES EVERY COPY. YOU MAY DISTRIBUTE COPIES OF |STAT, PROVIDED THAT
MASS DISTRIBUTION (SUCH AS ELECTRONIC BULLETIN BOARDS) IS NOT USED. YOU
MAY NOT MODIFY THE SOURCE CODE FOR ANY PURPOSES OTHER THAN GETTING THE
PROGRAMS TO WORK ON YOUR SYSTEM. ANY COSTS IN COMPILING OR PORTING |STAT
TO YOUR SYSTEM ARE YOURS ALONE, AND NOT ANY OTHER PARTY'S. YOU MAY NOT
DISTRIBUTE ANY MODIFIED SOURCE CODE OR DOCUMENTATION TO USERS AT ANY SITES
OTHER THAN YOUR OWN.
Ordering Information 8/30/89:
Carefully read the instructions below. Orders not following them may
be returned or even discarded. All prices include delivery and should
be prepaid to G. Perlman. Checks must be in US funds, drawn on a US bank.
Orders that demand any terms or conditions other than those in this notice
may be returned or discarded. Orders must include a delivery mailing label
acceptable to the post office, and international orders must include the
country name on the label.
UNIX Version of |STAT: $20/$30
Contents: Programs (C language) & Online Manual Entries
Format: half inch 9 track mag tape, 1600 bpi tar format
Format: 1/4 inch cartridge tape (this version costs $30)
MSDOS Version of |STAT: $15
Contents: Preformatted Manuals and Executables
Format: 2S/2D DOS 5.25 inch floppy diskettes
Format: 1.2 Mbyte HD DOS 5.25 inch floppy diskette
MSDOS Source: $10
Contents: C Source Code, Turbo C Project Files, Preformatted Manuals
Format: 1.2 Mbyte HD DOS 5.25 inch or 3.5 inch floppy
Handbook: $10
Contents: Examples, Reference Materials, CALC & DM Manuals, Manual Entries
Format: Typeset Manual (over 100 pages)
Gary Perlman                      Department of Computer and Information Science
perlman at cis.ohio-state.edu     The Ohio State University
614-292-2566                      2036 Neil Avenue Mall
                                  Columbus, OH 43210-1277
Key for reading changes to |STAT:
[+] new program [-] deleted program [F] new feature
[M] modification of existing feature [R] robustness enhancements
Recent Changes
General
updated all code to be ANSI C compatible, source now available on DOS
Specific Programs
probdist: [F] added -q option for quick random number generation
features: [MR] made compatible with other |STAT program options
dprime: [FM] added -p option and new I/O formats, repaired file-input
colex: [FM] added -c option for fixed column input (like cut)
Changes for Release 5.4 March 1989
General
Added missing value (NA) handling for most analysis programs
Specific Programs
pair: [F] added correlation coefficient to plots
anova: [F] added sorting for numerical factor labels, DOS LAN support
contab: [F] added sorting for numerical factor labels, DOS LAN support
[F] added -i option to restrict # of reported interactions
ff: [F] added file statistics language to three-part titles
calc, dm: [R] added special code to fix Sun conversion software bug
features: [+] summarize features of several items
Changes for Release 5.3 January 1987
General
Random number seeding on MSDOS now uses the system clock;
it is never interactive (affects MSDOS dm, perm, probdist).
Several rank order analyses are now supported.
Specific Programs
contab: [M] removed -m option for marginal totals; they are automatic
desc: [F] added standard deviations of skew and kurtosis
linex: [+] for line extraction
oneway: [M] removed -w option to request weighted means solution with
unweighted solution, -P plotwidth option now requested with -w
rankind: [+] analyze rank-order data for independent conditions
Median Test, Mann-Whitney U, Kruskal-Wallis H
rankrel: [+] analyze rank-order data for related conditions
Sign Test, Wilcoxon, Friedman, Spearman Rho
probdist: [F] binomial (b N p1/p2) distribution added
probdist: [M] output format for verbose option modified
ranksort: [F] reversal option added (-r)
repeat: [FM] added several new options, changed syntax
series: [M] minor syntax changes
stats: [M] added standard option parser and -v (verbose) option
Changes for Release 5.2 October 1986
General
Second Edition of Handbook (with manual entries)
Handbook examples now online
Manual entries no longer distributed separate from handbook
for infinite F ratios, 9999 is used
Specific Programs
cat: [+] added for MSDOS compatibility
colex: [F] formatted output of columns added
dm: [R] some new operators added, bugs fixed
dm: [F] random seed now follows R[AND] operator
dm: [M] no longer checks for non-numerical inputs (use number(si))
dsort: [+] for sorting data files by columns
ff: [+] for pagination, simple text formatting
fpack: [+] for packing files into plain archives
perm: [F] sorting options added
regress: [R] improved matrix calculations
Changes for Release 5.2 January 1986
General
on-line help in most programs (-LOV options)
|STAT Handbook and new manual entries
on-line manuals on MSDOS
Specific Programs
probdist: [+] 5 probability distributions with random number generation
pof: [-] deleted from distribution (probdist)
chisq: [-] deleted from distribution (probdist)
contab: [+] crosstabs and chi-square program
pair: [F] plotting options added
dataplot: [-] deleted from distribution (use pair plotting options)
anova: [R] program more robust against invalid designs
oneway: [F] error bar plots, unweighted means solution
regress: [F] better support for residual plotting (-e option)
vincent: [-] no longer distributed (use ts -l option)
Changes for Release 5.1 November 1985
General
several minor bugs removed
full package ported to MSDOS
Specific Programs
calc: [R] some syntax bugs fixed, ported to MSDOS
stats: [+] for simple statistics
trans: [-] no longer distributed (dm now on MSDOS)
Changes for Release 5.0 March 1985
General
reworked to increase portability, reliability, usability
most commands now use standard option parser (getopt)
[R] all calculations now done in double precision
[R] improved error messages
[R] better approximations for F-ratios
efficiency of I/O improved
most programs ported to MSDOS
[R] improved random number seeding on UNIX (perm, dm)
standard exit status (0) on successful runs
version control added
Specific Programs
regress: [F] partial correlation analysis
colex: [+] added as faster alternative to dm
trans: [+] added as alternative to dm (later dropped)
--
Name: Gary Perlman | Computer and Information Science Department
Email: perlman at cis.ohio-state.edu | Ohio State University, 228 Bolz Hall
Phone: 614-292-2566 | 2036 Neil Avenue Mall
Fax: 614-785-9837 or 292-9021 | Columbus, OH 43210-1277 USA