Object Processing
Project Goals
This engineering effort invites you to implement and use a program called objectprocessor that searches a text file for rows of data that match a specific pattern. The objectprocessor works by reading in a text file in comma-separated value (CSV) format that contains information about a person on each row of the file. Next, the program converts each row of data into an instance of the Person class, stores that instance inside of a List, and then performs one of several configurable searches for instances of Person that match the constraints specified on the command-line of the objectprocessor. Finally, the program will save all of the matching rows of data in the specified file. In addition to implementing the functions that perform the file input and output and the searching for matching Person objects in the List, you will use a comprehensive command-line interface, implemented with Typer, that allows you to easily confirm the files for input and output and the terms for the query for matching people.
Project Access
If you are a student enrolled in a Computer Science class at Allegheny College, you can access this assignment by clicking the link provided to you in Discord. Clicking this link creates a GitHub repository that you can clone to your computer by following the general-purpose instructions in the description of the technical skills. Specifically, you will need to use the git clone command to download the project from GitHub to your computer. Now you are ready to add source code and documentation to the project!
Expected Output
This project invites you to implement a Python program, called objectprocessor, that features different ways to search through a text file to find records that match the parameter values specified on the command-line. Specifically, each record of data in the specified input file should correspond to information about a specific person, organized in the following fashion. It is worth noting that all of the data in the input file is automatically generated by a tool that uses the Faker library and is thus synthetic in nature.
Cindy Burns,Dominican Republic,(102)481-3875,"Pharmacist, hospital",rtorres@example.org
Jason Bailey,Falkland Islands (Malvinas),+1-552-912-2326,Leisure centre manager,daniel51@example.com
Andrew Johnson,Portugal,733-554-3949,"Engineer, site",michael94@example.com
Carol Poole,Isle of Man,365.529.7270,Pensions consultant,piercebrenda@example.com
Riley Gonzalez,Libyan Arab Jamahiriya,7752827092,Materials engineer,osborneeric@example.net
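Some job titles in these rows, such as "Pharmacist, hospital", contain a comma and are therefore wrapped in double quotes, so splitting a row on every comma would produce six fields instead of five. The following is a minimal sketch, assuming the five-column layout shown above, of how the csv module in Python's standard library handles such rows. The variable names are illustrative and are not part of the provided source code.

```python
import csv
import io

# two rows copied from the sample data shown above
sample = (
    'Cindy Burns,Dominican Republic,(102)481-3875,"Pharmacist, hospital",rtorres@example.org\n'
    "Carol Poole,Isle of Man,365.529.7270,Pensions consultant,piercebrenda@example.com\n"
)

# csv.reader keeps a quoted field together even when it contains a comma
rows = list(csv.reader(io.StringIO(sample)))

# a naive split on commas breaks the quoted job title into two pieces
naive_fields = sample.splitlines()[0].split(",")

print(len(rows[0]), rows[0][3])
print(len(naive_fields))
```

The first print statement reports five fields with the job title intact, while the naive split reports six fields for the same row.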
After you finish a correct implementation of all the objectprocessor's features you can run it with the command poetry run objectprocessor --search-term tylera --attribute email --input-file input/people.txt --output-file output/people.txt and see that it produces output like the following. It is important to note that since this program is deterministic and not dependent on the performance characteristics of your computer, your implementation of the program should produce exactly the following output. This specific invocation of the program looks for people with records that have an email address containing the search term tylera. After inputting the 50,000 records from the file called input/people.txt and converting each one to an object-oriented format, the program searches and ultimately determines that there are four people in the input file that match the search parameters.
🧮 Reading in the data from the specified file input/people.txt
🚀 Parsing the data file and transforming it into people objects
🕵 Searching for the people with an email that matches the search term
'tylera'
✨ Here are the matching people:
- Debra Williams is a Retail merchandiser who lives in Guadeloupe. You can
call this person at 407-035-6634 and email them at tyleranderson@example.com
- Christopher Lin is a Embryologist, clinical who lives in United Kingdom.
You can call this person at (515)580-8082x35082 and email them at
tyleranthony@example.com
- William Valdez is a Air broker who lives in Algeria. You can call this
person at 408.592.1306 and email them at tylerashley@example.net
- Joshua Chaney is a Water engineer who lives in San Marino. You can call
this person at 310.624.7694x64127 and email them at tylerallen@example.net
✨ Saving the matching people to the file output/people.txt
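The search step in the output above can be sketched as a substring check against one attribute of each person. This is only an illustration under stated assumptions: the Person dataclass below is a stand-in for the project's class, the function name find_matching_people is hypothetical, and the sketch assumes that the value given to --attribute matches the name of an attribute on the object, which the provided source code may handle differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Person:
    """Stand-in for the project's Person class (assumed attribute names)."""

    name: str
    country: str
    phone_number: str
    job: str
    email: str


def find_matching_people(
    people: List[Person], attribute: str, search_term: str
) -> List[Person]:
    """Return the people whose named attribute contains the search term."""
    matches = []
    for person in people:
        # look up the attribute by name; None signals an unknown attribute
        value = getattr(person, attribute, None)
        if value is not None and search_term in value:
            matches.append(person)
    return matches


people = [
    Person("Debra Williams", "Guadeloupe", "407-035-6634",
           "Retail merchandiser", "tyleranderson@example.com"),
    Person("Carol Poole", "Isle of Man", "365.529.7270",
           "Pensions consultant", "piercebrenda@example.com"),
]
matching = find_matching_people(people, "email", "tylera")
print([person.name for person in matching])
```

Using getattr keeps the function independent of any single attribute, so the same loop can serve searches by email, country, or job.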
Finally, don't forget that you can display objectprocessor's help menu and learn more about its features by typing poetry run objectprocessor --help to show the following output. It is worth noting that all of the parameters to the objectprocessor program, except those connected to completion of command-line arguments or the help menu, are required. This means that the objectprocessor will produce an error if you do not specify the four required parameters respectively related to the search term, the "attribute" of a person stored in a row of data (e.g., the email address or the country), and both the input file and the output file that will save the search results.
Usage: objectprocessor [OPTIONS]
Input data about a person and then analyze and save it.
╭─ Options ─────────────────────────────────────────────────────────────╮
│ * --search-term TEXT [default: None] [required] │
│ * --attribute TEXT [default: None] [required] │
│ * --input-file PATH [default: None] [required] │
│ * --output-file PATH [default: None] [required] │
│ --install-completion Install completion for the │
│ current shell. │
│ --show-completion Show completion for the current │
│ shell, to copy it or customize │
│ the installation. │
│ --help Show this message and exit. │
╰───────────────────────────────────────────────────────────────────────╯
Please note that the provided source code does not contain all of the functionality to produce the output displayed in this section. As the next section explains, you should add the features needed to ensure that objectprocessor produces the expected output! Although you don't need to add any functionality to the person module, you will need to address all of the TODO markers in the process and main modules.
Note
Don't forget that if you want to run the objectprocessor you must use your terminal window to first go into the GitHub repository containing this project and then go into the objectprocessor/ directory that contains the project's source code. Finally, remember that before running the program you must run poetry install to add its dependencies, such as Pytest for automated testing and Rich for colorful output!
Adding Functionality
If you study the file objectprocessor/objectprocessor/process.py you will see that it has many TODO markers that designate the functions that you must implement so as to ensure that objectprocessor will produce correct output. For instance, you will need to implement most of the steps in the def extract_person_data(data: str) -> List[person.Person] function that takes as input all of the text in the input CSV file and produces as output a List of instances of the Person class in the person module. You will also need to implement both the functions that determine if a specific instance of Person matches the criteria specified on the program's command-line interface and those that perform the file input and output. Finally, you are invited to implement the functions in the main module that call the functions in process.
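One way to approach extract_person_data is sketched below. It is not the provided solution: the Person class is a local stand-in for the one in the person module, and the choice to skip rows that do not have exactly five fields is an assumption made for this example.

```python
import csv
from typing import List


class Person:
    """Stand-in for the Person class provided in the person module."""

    def __init__(
        self, name: str, country: str, phone_number: str, job: str, email: str
    ) -> None:
        self.name = name
        self.country = country
        self.phone_number = phone_number
        self.job = job
        self.email = email


def extract_person_data(data: str) -> List[Person]:
    """Convert the full text of a CSV file into a list of Person objects."""
    people = []
    # splitlines() yields one string per row, which csv.reader can iterate over
    for row in csv.reader(data.splitlines()):
        # assumption: ignore blank or malformed rows without exactly five fields
        if len(row) != 5:
            continue
        people.append(Person(*row))
    return people


data = (
    'Andrew Johnson,Portugal,733-554-3949,"Engineer, site",michael94@example.com\n'
    "Riley Gonzalez,Libyan Arab Jamahiriya,7752827092,Materials engineer,osborneeric@example.net\n"
)
people = extract_person_data(data)
print(len(people), people[0].job)
```

Relying on csv.reader instead of splitting on commas keeps a quoted job title such as "Engineer, site" in a single field.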
Once you complete a task associated with a TODO marker, make sure that you delete it and revise the prompt associated with the marker into a meaningful comment! After you revise all of the TODO markers your project should feature polished source code that is ready for contribution to an open-source project or for description on your professional web site. Finally, you will notice that this project does not come with test cases. If you want to both establish confidence in the correctness of your program and find defects in it, then you will need to design, implement, and run your own tests using Pytest!
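As a starting point for your own test suite, here is a minimal sketch of two test cases in the style that Pytest collects. The function parse_rows is a hypothetical stand-in defined inside the example so that it runs on its own; in the project you would instead import the functions that you implemented in the process module. Pytest discovers functions whose names begin with test_ when they appear in files named like test_*.py.

```python
import csv
from typing import List


def parse_rows(data: str) -> List[List[str]]:
    """Hypothetical function under test: split CSV text into rows of fields."""
    # drop empty rows so that blank lines do not appear in the result
    return [row for row in csv.reader(data.splitlines()) if row]


def test_parse_rows_keeps_quoted_comma_in_one_field():
    """A quoted job title that contains a comma should stay in one field."""
    data = 'Andrew Johnson,Portugal,733-554-3949,"Engineer, site",michael94@example.com'
    rows = parse_rows(data)
    assert len(rows) == 1
    assert rows[0][3] == "Engineer, site"


def test_parse_rows_returns_empty_list_for_empty_input():
    """Empty input should produce no rows."""
    assert parse_rows("") == []
```

Each test checks one behavior and has a descriptive name, which makes a failure report easier to interpret.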
Running Checks
If you study the source code in the pyproject.toml file you will see that it includes a section that specifies different executable tasks like lint. If you are in the objectprocessor/ directory that contains the pyproject.toml file and the poetry.lock file, the tasks in this section make it easy to run commands like poetry run task lint to automatically run all of the linters designed to check the Python source code in your program and its test suite. You can also use the command poetry run task black to confirm that your source code adheres to the industry-standard format defined by the black tool. If it does not adhere to the standard then you can run the command poetry run task fixformat to automatically reformat the code!
Along with running tasks like poetry run task lint, you can leverage the relevant instructions in the technical skills to run the command gatorgrade --config config/gatorgrade.yml to check your work. If your work meets the baseline requirements and adheres to the best practices that proactive programmers adopt you will see that all the checks pass when you run gatorgrade. You can study the config/gatorgrade.yml file in your repository to learn how the GatorGrade program runs GatorGrader to automatically check your program and technical writing.
Note
Don't forget that when you commit source code or technical writing to your GitHub repository for this project, it will trigger the run of a GitHub Actions workflow. If you are a student at Allegheny College, then running this workflow consumes build minutes for the course's organization! As such, you should only commit to your repository once you have made substantive changes to your project and you are ready to confirm its correctness. Before you commit to your repository, you can still run checks on your own computer by using the GatorGrade program to run GatorGrader.
Project Reflection
Once you have finished both of the previous technical tasks, you can use a text editor to answer all of the questions in the writing/reflection.md file. For instance, you should provide the output of the Python program in several fenced code blocks, explain the meaning of the Python source code segments that you implemented, and answer all of the other questions about your experiences in completing this project. A specific goal for this project's reflection is to ensure that you can explain Python source code written in an object-oriented fashion and discuss the trade-offs associated with this approach. For instance, you should understand how the following constructor, implemented in the __init__ method, is used to create a new instance of the Person class.
    def __init__(
        self, name: str, country: str, phone_number: str, job: str, email: str
    ) -> None:
        """Define the constructor for a person."""
        self.name = name
        self.country = country
        self.phone_number = phone_number
        self.job = job
        self.email = email
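To see how this constructor is used, consider the following minimal sketch. It copies the constructor into a small Person class (the real class in the person module may contain additional methods) and creates one instance from a row of the sample data.

```python
class Person:
    """Minimal copy of the class, limited to the constructor shown above."""

    def __init__(
        self, name: str, country: str, phone_number: str, job: str, email: str
    ) -> None:
        """Define the constructor for a person."""
        self.name = name
        self.country = country
        self.phone_number = phone_number
        self.job = job
        self.email = email


# calling the class runs __init__, which stores each argument
# as an attribute of the newly created instance
person = Person(
    name="Jason Bailey",
    country="Falkland Islands (Malvinas)",
    phone_number="+1-552-912-2326",
    job="Leisure centre manager",
    email="daniel51@example.com",
)
print(person.name)
print(person.email)
```

Every instance carries its own copy of the five attributes, so a List of Person objects can hold one independent record for each row of the input file.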
Project Assessment
Since this project is an engineering effort, it is aligned with the evaluating and creating levels of Bloom's taxonomy. You can learn more about how a proactive programming expert will assess your work by examining the assessment strategy. From the start to the end of this project you may make an unlimited number of reattempts at submitting source code and technical writing that meet all aspects of the project's specification.
Note
Before you finish all of the deliverables required by this project, it is worth pausing to remember that the instructor will give advance feedback to any learner who requests it through GitHub and Discord at least 24 hours before the project's due date! Seriously, did you catch that? This policy means that you can have a thorough understanding of ways to improve your project before its final assessment! To learn more about this opportunity, please read the assessment strategy for this site.
Seeking Assistance
Emerging proactive programmers who have questions about this project are invited to ask them in either the GitHub discussions forum or the Proactive Programmers Discord server. Before you ask your question, please read the advice concerning how to best participate in the Proactive Programmers community. If you find a mistake in this project, please describe it and propose a solution by creating an issue in the GitHub Issue Tracker.