Postgraduate Module Descriptor


POLM150: Text as Data

This module descriptor refers to the 2017/8 academic year.

Module Aims

The module has three primary aims. First, it provides an applied introduction to the use of text analysis in social scientific research. You are introduced to the entire research “pipeline” for a typical text-based project, including: a) collecting textual information online (e.g., web scraping), b) key approaches to text preprocessing and “feature extraction,” and c) supervised and unsupervised approaches to text classification. These methods are essential for data scientists interested in social science questions. Second, the module introduces you to the Python programming language. Python is a popular language for scientific computing, and knowledge of Python will place you at a competitive advantage in industry, government, or when pursuing further education. Third, the module assessments aim to reinforce the importance of research design and thus give you a further opportunity to hone critical research skills.

Intended Learning Outcomes (ILOs)

This module's assessment will evaluate your achievement of the ILOs listed here. You will see references to these ILO numbers in the details of the assessment for this module.

On successfully completing the programme you will be able to:
Module-Specific Skills
1. apply appropriate tools for collecting and preprocessing textual information;
2. understand and apply a variety of text analysis methods to answer questions in social science and public policy;
3. critically evaluate the strengths and weaknesses of particular text analytic tools for answering research questions in the social and policy sciences;

Discipline-Specific Skills
4. employ text analytic methods to empirically evaluate theories and hypotheses in the social and policy sciences;
5. evaluate the role of text analysis for supporting policy analysis and evaluation;
6. construct arguments based on textual data for both written and oral presentation;
7. demonstrate a strong command of research design through written and oral assessments;

Personal and Key Skills
8. gain a solid foundation in the Python programming language;
9. communicate effectively in speech and writing;
10. work independently and within a limited time frame to complete a specified task.

Module Content

Syllabus Plan

Although the module’s precise content may vary from year to year, it is envisaged that the syllabus will cover the following topics:

  • Programming in Python
  • Collecting textual information online
  • Preprocessing text for analysis and “feature selection”
  • Dictionary-based methods for text classification
  • Supervised and unsupervised learning for text classification
  • Ideological scaling
  • Using text-based measures in regression models

Learning and Teaching

This table provides an overview of how your hours of study for this module are allocated:

| Scheduled Learning and Teaching Activities | Guided independent study | Placement / study abroad |
| 22 | 128 | |

...and this table provides a more detailed breakdown of the hours allocated to various study activities:

| Category | Hours of study time | Description |
| Scheduled Learning and Teaching Activity | 22 | 11 x 2-hour lectures |
| Guided Independent Study | 40 | Activities to familiarize you with the Python programming language |
| Guided Independent Study | 30 | Reading and preparing for lectures |
| Guided Independent Study | 58 | Research and analysis for final essay and presentation |

Online Resources

This module has online resources available via ELE (the Exeter Learning Environment).

Other Learning Resources