Syllabus
DESCRIPTION
This course focuses on collecting data for scientific research in the Internet age. It covers three popular data collection methods: experiments, surveys, and observational data collection. Although these have long been the most common data collection methods in the offline world, this course is distinctive in discussing them in an online context, where the data, the collection mechanism, or both are online. Specifically, we will discuss Internet experiments, online and email surveys, and web crawling and other tools for collecting Internet data.
The course emphasizes designing efficient and ethical data collection plans and executing them. We will discuss the statistical, ethical, technological, and practical aspects of collecting data in an online environment. We will also cover the academic approval process required for conducting human subjects research in the USA.
The course is intended for any PhD student in the business school who will be performing empirical research.
ORGANIZATION
This is a lecture-lab course in which the instructor presents the topics and students learn through hands-on homework and projects. Several readings will be assigned, and students are expected to come prepared to discuss them in class.
OBJECTIVES
To introduce students to different data collection tools and the suitability of the generated data for different research purposes.
To introduce students to data collection in an online environment, and its ethical, technological, and statistical aspects.
To educate students about the proper use and evaluation of online data collection tools.
To educate students about collecting data in an academic environment.
To provide a hands-on experience for learning about the various pitfalls of data collection.
To provide students with an opportunity to collect data in practice, before they pursue collecting their own data for their dissertation work.
TOPICS
Introduction: Data collection methods; Types of data and their research value
Collecting Web data
Online and email surveys
Online experiments
READINGS
Allen, G. N., Burk, D. L., and Davis, G. B. 2006. Academic Data Collection in Electronic Environments: Defining Acceptable Use of Internet Resources, MIS Quarterly, vol 30(3), 599-610.
Angst, C. M., Agarwal, R., and Kuruzovich, J. 2008. Bid or Buy: Behavioral Predictors of Strategic Exit in Online Auctions, The International Journal of Electronic Commerce, vol 13(1), 59-84.
Bakos, Y., Lucas Jr., H. C., Oh, W., Simon, G., Viswanathan, S., and Weber, B. W. 2005. The Impact of E-Commerce on Competition in the Retail Brokerage Industry, Information Systems Research, vol 16(4), 352-371.
Bapna, R., Goes, P., Gopal, R., and Marsden, J. R. 2006. Moving from Data-Constrained to Data-Enabled Research: Experiences and Challenges in Collecting, Validating and Analyzing Large-Scale e-Commerce Data, Statistical Science, vol 21(2), 116-130.
Federal Trade Commission: Identity Theft Survey Report. September 2003.
Kenett, R. and Shmueli, G. 2011. On Information Quality, Working Paper RHS 06-100, Smith School of Business, University of Maryland.
Shmueli, G., Jank, W., and Bapna, R. 2005. Sampling eCommerce Data from the Web: Methodological and Practical Issues, Proceedings of the American Statistical Association, Statistical Computing Section [CD-ROM], Alexandria, VA: American Statistical Association.
Souza, G. C., Bayus, B. L., and Wagner, H. M. 2004. New-Product Strategy and Industry Clockspeed, Management Science, vol 50(4), 537-549.
Suggested:
Hill and Lewicki, Statistics: Methods and Applications, Chapter on Experimental Design (available online)
Best, S. J. and Krueger, B. S., Internet Data Collection (ISBN: 0761927107)
Fowler, F. J., Survey Research Methods, 3rd edition (ISBN: 0761921907)
Lohr, S., Sampling Design and Analysis (ISBN: 0534353614)
Katkar, R. and Reiley, D. H. 2006. Public Versus Secret Reserve Prices in eBay Auctions: Results from a Pokémon Field Experiment, Advances in Economic Analysis and Policy, vol 6(2), Article 7.
Keser, C., Trust and Reputation Building in E-Commerce, CIRANO working paper 2002s-75 (or short version in IBM Systems Journal, vol 42(3), 2003).
SOFTWARE
We will use several software programs and packages in the course:
Google Spreadsheet Surveys and/or SurveyMonkey (available online)
Minitab
PHP (open source)
TIBCO Spotfire
Computer Facilities:
The course meetings will take place in the eMarkets Research Lab (VMH3509). Students will have access to the lab and may use its computers and software for projects and homework assignments. Alternatively, students can work on their own computers, either by installing the required software or by accessing it via the Smith Portal.
COURSE POLICIES
Attendance:
Students are expected to attend all class meetings. A student who misses class for a good reason is still responsible for submitting any homework or other deliverable on time. Attendance at the final Presentation Session is mandatory. Absence from this session will reduce your final letter grade.
Homework Assignments:
Students are expected to complete all homework assignments and to submit each one by its due date. Unless specified otherwise, students should complete the homework assignments on their own, without consulting others.
Final Project:
The final project combines the various aspects taught during the course. Each student will work individually on a project and will present it during the final Presentation Session. In addition to the presentation, students will write a professional project report, to be submitted within two days of the presentation.