Introduction to Python Programming for Business and Social Science Applications. Frederick Kaefer
of “big data” in disciplines such as bioinformatics, neuroscience, and astronomy has made programming know-how ever more crucial: researchers who can write code in Python can manage their data sets and work more efficiently on research-related tasks, including analyzing and visualizing data (Perkel, 2015). The corporate world also recognizes the importance of analyzing data to gain insights about current and potential customers. In addition, corporations use Python to develop business applications for numerous purposes, including strategic information systems such as enterprise resource planning (ERP), customer relationship management (CRM), and ecommerce applications (Smets, 2019).
Python has several advantages over commercial packages such as SPSS and SAS: it is open source and can run on many platforms. Users are free to copy, distribute, and even change the software. Python is well suited to teaching statistics in a data-rich environment, and its built-in debugging feature simplifies debugging for the programmer (Ozgur, Colliau, Rogers, Hughes, & Myer-Tyson, 2017). Like Python, R is an open-source programming language used for data analytics. Both R and Python are used extensively in the business world and for academic and research purposes. We focus on Python because we see a need for a text that presents Python programming specifically for those in the fields of social sciences and business to develop applications for data analytics. Whereas some may prefer R for statistical analysis and plotting charts, Python is a general-purpose scripting language that can also be used to develop applications with graphical user interfaces (GUIs) and may be favored when working with text-based data.
Python Is Free, Open-Source Software (FOSS)
Perhaps the most important reason for the rapid growth in the usage of Python in business and the social sciences is that Python is free open-source software (FOSS). FOSS is an inclusive term that covers both free software and open-source software (Marsan, Pare, & Beaudry, 2012). The definition of free software is that the users have the freedom to run, copy, distribute, study, change, and improve the software (Free Software Foundation, 2019). Open-source software requires that the license to use the software shall not restrict any party from selling or giving away the software as a component of a larger software distribution (Open Source Initiative, 2007). As a result, organizations not only are free to use and change Python but also can create and sell commercial applications using Python.
Being FOSS gives Python a true advantage over commercially available packages because the software is continually improved. Software development peers iteratively develop, incrementally release, review, and refine FOSS projects in an ongoing agile manner (Scacchi, 2004b, referenced in Goth, 2007). FOSS communities develop software that is extremely valuable, generally reliable, globally distributed, made available for acquisition at little or no cost, and readily used in its associated community (Scacchi, 2004a).
User Community and Python Resources
You can find many Python resources at the Python website, https://www.python.org. You can download the latest version of Python from https://www.python.org/downloads/ (Version 3.8.0 as of October 14, 2019). Python is platform-independent software: it can run on most, if not all, of the latest operating systems and computing platforms. A platform is the combination of a physical device and an operating system. You can run the latest version of Python on Windows, Linux/Unix, Mac OS X, and other operating systems. You can find documentation for the latest version of Python (as well as for older versions) at https://docs.python.org/dev/. Table 1.1 lists the most recent versions of Python documentation that were available on the Python website as of November 22, 2019. Previous versions of documentation remain available online as well. The Python Package Index, located at https://pypi.org/, is a repository of software for the Python programming language.
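As a small illustration of the idea that a platform combines a physical device and an operating system, the following sketch uses the standard library's platform module to report both, along with the Python version in use. The function name describe_platform and the dictionary keys are our own choices for this example, and the values printed will differ from one computer to another.

```python
import platform


def describe_platform():
    """Return a few facts about the platform this script is running on.

    The function name and the dictionary keys are illustrative choices.
    """
    return {
        "operating_system": platform.system(),       # operating system name
        "machine": platform.machine(),               # hardware (device) type
        "python_version": platform.python_version()  # version as "A.B.C"
    }


info = describe_platform()
for name, value in info.items():
    print(name, "=", value)
```

Because the same script runs unchanged on Windows, Linux/Unix, and Mac OS X, only the reported values change from platform to platform.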
Table 1.1
Lessons learned: In this section, we learned that Python is free and open-source software (FOSS) and that there are now more than 212,000 projects with packages written in Python that are available to use and modify in the Python Package Index. The goal of this book is to teach Python programming to those in the fields of social sciences and business to develop applications using Python packages for data analytics.
Setting Up a Python Development Environment
One way to set up a Python development environment on a computing device is to connect to the Python download webpage (https://www.python.org/downloads/), as shown in Figure 1.1. Once on that webpage, if you are running Windows, simply click on the Download Python button (for the latest version). If you are not running Windows, select the link that corresponds to the operating system you are using (found immediately below that download button) and follow the instructions found on the corresponding webpage. Note that we will be using the Windows operating system for illustrating Python throughout the book, so if you are using Mac OS or another operating system, you will have some variations in the appearance and detailed workings of Python.
Figure 1.1 Python Download Webpage
The lower half of the Python download webpage shown in Figure 1.1 lists the release dates for different release versions.
Python Versions
Python versions are numbered A.B.C, where A is the major version number, B is the minor version number (for incremental changes), and C is the micro-level number (for bug fixes; Python Software Foundation, 2019, “General Python FAQ”). Release Version 3.x has significant changes from release Version 2.x, and code written in each version is not compatible with the other version. Figure 1.1 shows that the release date for release Version 2.7.17 is October 19, 2019. Updates are still being released for Python Version 2.x for those who developed Python code in the past and do not want to make the changes necessary for that code to be compatible with Version 3.x. Python Version 3.x is the recommended version because Version 2.x, although still widely used, will no longer be maintained after January 1, 2020 (Python Software Foundation, 2019, “General Python FAQ”). This book uses Python Version 3.7, and we have tested all Python code in the book using that specific version. If you use either an older or a newer version of Python, there may be some issues with code execution.
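The A.B.C numbering scheme can be inspected from within Python itself. The sketch below is a minimal illustration that assumes Python 3.6 or later (it uses formatted string literals); the helper names parse_version and format_version are hypothetical and chosen only for this example, while sys.version_info is part of the standard library.

```python
import sys


def parse_version(text):
    """Split a version string such as "2.7.17" into (major, minor, micro)."""
    major, minor, micro = (int(part) for part in text.split("."))
    return major, minor, micro


def format_version(info):
    """Build an "A.B.C" string from the first three items of a version tuple."""
    return f"{info[0]}.{info[1]}.{info[2]}"


# Version of the interpreter running this script (varies by installation).
print("Running Python", format_version(sys.version_info))

# The Version 2.x release mentioned in the text.
major, minor, micro = parse_version("2.7.17")
print("Major:", major, "Minor:", minor, "Micro:", micro)

# Simple check that the major version is the recommended one.
if sys.version_info[0] < 3:
    print("This interpreter is Version 2.x; Version 3.x is recommended.")
```

Comparing the major version number in this way is one simple means of confirming that code is being run with Version 3.x.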
Lessons learned: In this section, we learned how to install Python on our computer and that there are different versions of Python. We also learned that using different operating systems and different versions of Python can affect how we write and execute Python code.
Executing Python Code in the IDLE Shell Window
After downloading and installing the Python development environment, you can verify that it is functioning properly. The standard distribution of Python comes with an Integrated Development Environment (IDE) named IDLE (for Integrated Development and Learning Environment). An IDE contains facilities for writing and editing code as well as testing and debugging code. IDLE runs on Windows, Mac OS X, and UNIX. Documentation for IDLE can be found at https://docs.python.org/3/library/idle.html. The IDLE Python shell window is an interactive interpreter that can execute lines of Python code one at a time. Figure 1.2 shows the IDLE Python shell console on Windows, and Figure 1.3 shows the IDLE Python shell console on a Mac. Although the two are mostly similar, there are platform-specific variations. For consistency, we will be using Windows-based Python illustrations throughout the remainder of this textbook.
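One simple way to verify the installation is to type a few statements into the IDLE shell window, pressing Enter after each one. The statements below are our own illustrative choices rather than a required test; the variable names total and greeting are arbitrary.

```python
# Type each statement at the shell prompt, one line at a time.
total = 2 + 3                 # the shell stores the result in total
print(total)                  # displays 5

greeting = "Hello, Python"    # assign a text value
print(greeting.upper())       # displays HELLO, PYTHON
```

If each statement runs without an error message and the results appear, the interpreter is working.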
Figure 1.2 The IDLE