National Institute of Statistical Sciences
Report of the Director
The focus of attention during this year was the re-evaluation of the directions to be undertaken by NISS following the unsuccessful bid to gain core funding through the National Science Foundation competition for Mathematical Sciences Research Institutes. NISS made strong efforts to expand its presence in Information Technology (IT) and to raise general awareness of the role of Statistics in IT.
A major new IT research project was begun (October 1) on Digital Government (see below). The goal is to build Web-based systems that preserve privacy and confidentiality in access and use of Federal databases. This project is being carried out with the cooperation of several Federal statistical agencies (Bureau of Labor Statistics, Census, National Agricultural Statistics Service and National Center for Health Statistics).
Continued research in transportation was expanded to include collaborative efforts with the North Carolina Department of Transportation (NCDOT), the Chicago Department of Transportation (CDOT) and the Bureau of Transportation Statistics (BTS).
Measurement, Modeling and Prediction for Infrastructural Systems. This NSF-funded project began on November 1, 1994. It will continue through the Fall of 2000. Principal investigators are A. Karr (NISS), N. Rouphail (NCSU) and J. Sacks (NISS). Research on travel demand forecasting has concentrated on synthesizing populations from aggregated data and multinomial choice modeling. Participants were C. Bhat (Texas), F. Koppelman (NU), P. Speckman (Missouri) and D. Sun (Missouri). Aspects of this work overlap with the LANL project (see below).
Work on network modeling has addressed equilibria of large network models and the validation of such models. The key figure here is D. Boyce (UIC) with the assistance of P. Nelson (UIC), R. Buck (Western Michigan) and Sacks. Effective code has been developed and computer experiments were successfully conducted to improve the running time of the code. Development of a validation process for the model is now under way. Performance measures are being studied; mode choice and origin/destination parameters have been estimated. The Census home-to-work survey is planned for use in validation.
Work was begun to link transportation network models (of the type pursued above) with econometric models of commodity flows within a 5-state Midwest region in order to study the effects of economic activity on the transportation network of the region. This work has been led by G. Hewings (UIUC) and involved Boyce, Sacks, and P.-W. Cheng (UIC). Input/output models have been successfully implemented. Adapting the network models to this context is underway.
Vehicle matching, visualization of freeway travel times and assessment of algorithms for incident detection were the focus of work completed this year at UC-Berkeley. The work involved P. Bickel and J. Rice (California-Berkeley), Y. Ritov (Hebrew University), with input by Sacks.
A Phase II study of the microsimulator (CORSIM) is near completion. CORSIM is to be used as a platform for evaluating signalization strategies. Preliminary results were presented to CDOT in October, 1999, indicating successful implementation of methods that find good signal plans and serve to evaluate existing plans and plans found through application of other computer codes purporting to find optimal signalization strategies. The methods developed at NISS will likely be used to evaluate a proposed study of the potential of an expensive automated signal system for the city of Chicago. The project involves B. Park (NISS), Rouphail, V. Thakuriah (UIC), Karr and Sacks.
Research on deterioration of concrete, primarily at the Center for Advanced Cement Based Materials at Northwestern University (NU), included experiments for measuring and models for predicting water permeability of cracked samples of concrete. New experimental paradigms and apparatus for measuring water permeability of samples under load were developed. A program of numerical experimentation, to simulate permeability and damage, was undertaken. This work is led by Karr and S. Shah (NU); key participants are B. Ankenman (NU), T. Igusa (NU), J. Picka (NISS, located at NU as a NISS postdoc and now at the University of Maryland, College Park) and C. Aldea (NU).
Statistically-Based Activity Generation. The research necessary to provide regional-scale activity patterns as inputs to the TRANSIMS model, developed by Los Alamos National Laboratory (LANL) for modeling travel and traffic in Portland, OR, has been completed. Models and algorithms have been devised and implemented to produce travel-activity patterns for the population in the Portland metropolitan area. The project is directed by Karr in cooperation with R. Beckman (LANL). Key participants have been Speckman and Sun with added participation of Koppelman, K. Vaughn (Metropolitan Transportation Commission, Oakland, CA) and NISS Junior Fellow J. Lee.
A cooperative effort is underway with the city of Fayetteville, NC, the Fort Bragg Army base and NCDOT to study the use of TRANSIMS and other travel models for transportation planning in the Fayetteville area. The research team includes J. Stone (NCSU), Rouphail, Park, Karr and Sacks.
ITS Integration of Real-Time Emissions Data and Traffic Management Systems. This project, funded by the ITS/IDEAS program of the Transportation Research Board, has been completed. Field data collected in the project were used to relate pollutant emissions to traffic characteristics at various levels of integration. The principal participants were Karr, Rouphail, C. Frey (NCSU) and C. Gu (Purdue).
A related project was begun to use (automobile) on-board emissions equipment to measure detailed characteristics of vehicular movement (such as speed and acceleration), as well as pollutants. The ultimate goal is to assess the effects of signalization and driver characteristics on emissions output under field conditions. This project involves Rouphail, Frey and Karr.
Indices of Environmental Status and Trend. As part of a consortium (which includes American, George Mason, Maryland-Baltimore County and Penn State) organized by the EPA's new Center of Environmental Information and Statistics (now reorganized within the new EPA Office of Information), NISS has worked on protocols and case studies for the public release and use of environmental data. Sacks has been leading this effort with support from P. Bloomfield (NCSU) and R. Smith (UNC-CH).
Code Decay in Legacy Software Systems. This partnership with Lucent Technologies is essentially complete. Co-project directors are Karr and S. Eick of Lucent. The project is directed at improving the development process for large legacy software systems by application of statistical strategies to model and control the effects of design decisions, software architecture and organizational factors. Key participants were T. Graves (NISS; now at Los Alamos National Laboratory), J. S. Marron (UNC-CH), A. Mockus (Lucent), N. Staudenmayer (Duke) and D. Weiss (Lucent). Principal products of the research during the year were network models for the breakdown of modularity over time, methods to impute change effort, prediction methods for faulty software upgrades, assessment of the economic efficacy of "perfective maintenance" (re-engineering software in anticipation of future changes) and methods for visualization of software changes.
Pilot Projects to Explore Large Data Sets. This research effort comprises two interconnected pilot projects dealing with large data sets, each involving a major industrial partner. It is now in the third year of a 3-year effort.
Drug Design. The component on drug discovery developed sequential strategies to enable efficient High Throughput Screening of drug compounds for chemical activity. Results have been presented at a variety of meetings; reports have been completed and papers submitted on general sequential strategies, multiple classification trees and pooling. This project is being carried on in close collaboration with Glaxo Wellcome (GW), Research Triangle Park, NC. The effort is led by Sacks and S. Young (GW). Key researchers involved during the year were K. Tatsuoka (NISS), M. Xie (Rutgers) and A. Menius (GW).
During the year, the focus moved into bioinformatics issues, especially gene discovery and expression as well as modeling of protein expression. Involved were Young, Tatsuoka, A. Stark (NISS), F. Seillier-Moiseiwitsch (UNC-CH), B. McNeney (NISS), J. Graham (NISS) and B. Weir (NCSU).
Network Intrusion. The first year of this effort produced strategies for identifying network intruders by modeling user profiles and identifying anomalous patterns. The past year's efforts focused on modeling potential customer "defections" through use of network data, producing statistical strategies for capitalizing on large network data to predict individual user behavior. This research was directed by Karr and D. Pregibon (AT&T). Key participants were M. Schonlau (NISS, now at RAND Corporation), Y. Vardi (Rutgers), W. DuMouchel (AT&T), R. Bell (AT&T) and N. Raghavan (Ohio State, visiting NISS).
North Carolina School of Science and Mathematics. NISS developed a framework, strategy and initial database for evaluation of NCSSM's Educational Futures Center (EFC). The EFC mission is to create a state-wide network of "cybercampuses" that will carry NCSSM's programs beyond its residential campus in Durham, NC. Innovative electronic and educational technologies are involved. NISS adapted this to create an evaluation arm for a 5-year, $6,500,000 grant to NCSSM from the Office of Education to deal with teacher development in the use of IT for education. This effort was led by T. Suarez (UNC-Fayetteville) with assistance from Karr and Sacks. NISS will remain as a consultant to the evaluation process as it evolves.
Digital Government: Access and Confidentiality of Federal Data Sets. NISS has been funded by the NSF to develop a (prototype) World Wide Web-based system to allow adequate access to confidential data from Federal agencies while maintaining low risk of disclosure. The project has Karr as PI. It involves MCNC (a major network facility in Research Triangle Park, adjacent to NISS), S. Fienberg, G. Duncan and L. Sweeney (Carnegie Mellon), S. Keller-McNulty (LANL), M. Franklin (Maryland) and A. Saalfeld (Ohio State), as well as several Federal agencies: the Bureau of the Census, Bureau of Labor Statistics, National Center for Health Statistics and the National Agricultural Statistics Service (NASS). Work has already begun by Karr, A. Sanil (NISS) and J. Hilden-Minton (NISS) on system architecture and database schema, and on aggregation strategies for NASS release of data on pesticide use in agriculture (the aggregation leading to reduced risk of disclosure of individual users).
BTS. NISS is a sub-contractor in a large (task-order) contract awarded (October 1, 1999) to Battelle Memorial Institute (Columbus, Ohio). The work includes supporting statistical activities of the BTS, analysis of travel survey data and development of intermodal transportation data bases. Specific details remain to be clarified but the prospects are of NISS playing a continuing forward-looking role in statistical problems arising in transportation.
The primary avenue to put NISS on firmer footing is planned to revolve around entering a new competition for Institutes, intended by the NSF. Planning is being done by a Task Force of the Board of Trustees. The staff has been involved in supplying information on structural aspects of planned proposals.
In addition, efforts are being mounted to position and advance NISS interests in IT. To date one major project is underway: Digital Government. NISS is currently planning to pursue more such projects in the future.
A first effort, hurriedly assembled in January 1999, was submitted to DARPA but did not meet with success. Key parts of that planned project (to develop a highly automatic model/data integration system) are now being reshaped.
Workshops. A Workshop on Statistics and Information Technology was originally scheduled to be held at NISS on September 16-17, 1999. The intervention of Hurricane Floyd forced cancellation of the workshop; it will be held on November 11-12. (An announcement of that workshop appears in an appendix.) This is an effort to advance the discipline's awareness and interests in IT and to reflect the position paper that NISS prepared in response to the report of the President's Information Technology Advisory Committee (PITAC) (available on-line at www.ccic.gov/ac/report/).
On related, but not directly IT-oriented, issues, NISS is collaborating with LANL and the Committee on Applied and Theoretical Statistics (CATS) of the National Research Council to hold a workshop on the Evaluation of Complex Computer Models.
Data Networks. A proposal to the second KDI competition at the NSF was developed in conjunction with MCNC and other organizations to measure and visualize behavior of data networks. It was not funded.
Visualization. A proposal on visualization of association, uncertainty and dependence in large networks (ranging from the Internet to transportation) was submitted to the NSF in July, 1999.
Junior Fellows are generally supported partly by funds from NSF and partly by funds from NISS projects. They work on one or more projects that match their backgrounds and career goals, with time available as well to pursue independent research activities.
Fellows whose appointments at NISS ended during the past year are:
Jinko Graham (Ph.D., Biostatistics, University of Washington), has taken a position as Assistant Professor at Simon Fraser University.
Todd Graves (Ph.D., Statistics, Stanford), has taken a position at Los Alamos National Laboratory.
Brad McNeney (Ph.D., Statistics, University of Washington), has taken a position as Assistant Professor at Simon Fraser University.
Jeffrey Picka (Ph.D., Statistics, University of Chicago), has taken a position as Assistant Professor at the University of Maryland.
Matthias Schonlau (Ph.D., Statistics, University of Waterloo), has taken a position with the RAND Corporation in Santa Monica, California.
Kay Tatsuoka (Ph.D., Statistics, Rutgers), has taken a position in the bioinformatics group at SmithKline Beecham in Philadelphia.
The status of continuing appointees is as follows:
Jaeyong Lee (Ph.D., Statistics, Purdue), working on Digital Government and Transportation
Byungkyu Park (Ph.D., Civil Engineering, Texas A&M), working on the transportation project
Ashish Sanil (Ph.D., Statistics, Carnegie Mellon), working on the Digital Government project
Alex Stark (Ph.D., Engineering, Cambridge, UK), working on bioinformatics as part of the large data sets project.
New appointees in 1999 are:
Hillel Bar-Gera (Ph.D., Civil Engineering, University of Illinois at Chicago), working on the transportation project
Rainer Spang (Ph.D., Biology, University of Bonn), working on bioinformatics
Harry Zuzan (Ph.D., Statistics, University of Guelph), working on bioinformatics.
Computing. Lael Tucker, Computational Systems Manager, left NISS to take up a position in Colorado. He was replaced by Deborah Eberhart.
Publications. NISS technical reports are now distributed as Web-accessed documents, rather than hard copy.
Visitors and other Appointments. Visits to NISS were made by D. Daley (Australian National), working on the transportation project; Y.-B. Lim (Korea), working on drug discovery; J. Lynch (South Carolina), working on materials and leading a seminar to develop a potential project; D. Sun (Missouri), working on transportation projects; and N. Raghavan (Ohio State), working on network data.