FE 459
Spring 2026

Assignment 12: Databases and SQL

Introduction/Discussion on Wednesday 4/22/2026
Workshop in class on Monday 4/27/2026
Submission due by 9:00 p.m. on Wednesday 4/29/2026.

Preliminaries

In your work on this assignment, make sure to abide by the collaboration policies of the course.

For each problem in this problem set, we will be writing or evaluating some Python code. You are encouraged to use the Spyder IDE which will be discussed/presented in class, but you are welcome to use another IDE if you choose.

If you have questions while working on this assignment, please post them on Piazza! This is the best way to get a quick response from your classmates and the course staff.

Programming Guidelines

  • Refer to the class Coding Standards for important style guidelines. The grader will be awarding/deducting points for writing code that comforms to these standards.

  • Every program file must begin with a descriptive header comment that includes your name, username/BU email, and a brief description of the work contained in the file.

  • Every function must include a descriptive docstring that explains what the function does and identifies/defines each of the parameters to the function.

  • Your functions must have the exact names specified below, or we won’t be able to test them. Note in particular that the case of the letters matters (all of them should be lowercase), and that some of the names include an underscore character (_).

  • If a function takes more than one input, you must keep the inputs in the order that we have specified.

  • You should not use any Python features that we have not discussed in class or read about in the textbook.

  • Your functions do not need to handle bad inputs – inputs with a type or value that doesn’t correspond to the description of the inputs provided in the problem.

  • You must test each function after you write it.

Warnings: Individual Work, Generative AI, and Academic Conduct!!

  • This is an individual assignment. You may discuss the problem statement/requirements, Python syntax, test cases, and error messages with your classmates. However, each student must write their own code without copying or referring to other student’s work.

  • It is strictly forbidden to use any code that you find from online websites including but not limited to as CourseHero, Chegg, or any other sites that publish homework solutions.

  • It is strictly forbidden to use any generative AI (e.g., ChatGPT, Claude, Gemini, CoPilot or any similar tools**) to write solutions for any assignment.

Students who submit work that is not authentically their own individual work will earn a grade of 0 on this assignment and a reprimand from the office of the Dean.

If you have questions while working on this assignment, please post them on Piazza! This is the best way to get a quick response from your classmates and the course staff.

Objective and Overview

Objective: Develop experience using databases and the structured query language.

Task 1: Writing SQL Queries

80 points; individual-only

In this task, you will write several progressively more challenging SQL queries to obtain results from the sqlite3 database contained in the file portfolio.db.

You must save these as python string variables in the python file a12task1.py (see sample in question 0).

You must save each query with the specified variable name so that the autograder script can find it. You will NOT submit your portfolio.db file; rather, the autograder will use a different version of this file that contains the same database tables, but different records. If your queries are created corrected, they will work on any data that follows the specified database schema.

To simplify running the queries, we will use the pandas library, which contains a helper function to execute an SQL query and return the results as a pd.DataFrame object. An example is shown in the starter code file.

Begin by downloading the two files: portfolio.db and a12task1.py. Do this entire task in the file a12task1.py, which contains some import statements and a helper function run_query.

  1. To begin, you must discover the names of the tables and the field names for each table. Here are four initial queries to try. You will notice these in the starter code file:

        example_0a = '''SELECT name FROM sqlite_master'''
        example_0b = '''pragma table_info(clients)'''
        example_0c = '''pragma table_info(trades)'''
        example_0d = '''pragma table_info(price_history)'''
    
        ################################################################################
        if __name__ == '__main__':
    
            # Ask Pandas to run a query and return the resulting data set as a pd.DataFrame object:
            print(run_query(example_0a))
    

    Notice that the only statements in the global scope are variable declarations and comments. Do not print anything in the global scope, as it will cause the autograder to fail.

  2. Write a SELECT query to retrieve all records from the price_history table for the date of 2020-12-31, in order by security.
    Save the query as a string in the variable named sql_01.

  3. Write a SELECT query to retrieve all records from the trades table for the client Angela Merkel, in order by trade_date. Save the query as a string in the variable named sql_02.

    Hints:

    • You may use a single table for this query, and specify the client_id in the WHERE clause.
  4. Write a SELECT query to retrieve all records from the trades table corresponding to trades made during the calendar year 2018. Sort the records by trade_date. Save the query as a string in the variable named sql_03.

    Hints:

    • You may use a single table for this query, and specify the trade_date in the WHERE clause.

    • Use the boolean AND to join multiple conditions together.

  5. Write a SELECT query to produce a report of all securities and the number of trades for each, in order by security. Save the query as a string in the variable named sql_04.

    Hints:

    • Use the COUNT aggregate function and an appropriate GROUP BY clause.
  6. Write a SELECT query to produce a report showing all clients, and the number of trades for each client. Your results must include the client’s first and last names, as well as a count. Sort the results in descending orer by the number of trades. Save the query as a string in the variable named sql_05.

    Hints:

    • Use the INNER JOIN clause to join two tables together into one result set. Specify the column from each table on which to join in the ON clause.

    • Use the COUNT aggregate function and an appropriate GROUP BY clause.

  7. Write a SELECT query to produce a report showing all trades of the security CSCO. Your result set must include the date of the trade, the quantity traded, and the client’s first and last names. Sort the results by the date of the trade. Save the query as a string in the variable named sql_06.

    Hints:

    • Use the INNER JOIN clause to join two tables together into one result set. Specify the column from each table on which to join in the ON clause.
  8. Write a SELECT query to produce a report showing all trades by all clients for the calendar year of 2019. Your result set must include the date of the trade, the client’s first and last names, the security traded, and the quantity of the trade. Sort the results by the date of the trade. Save the query as a string in the variable named sql_07.

  9. Write a SELECT query to produce a report showing all trades (for all dates) by Angela Merkel, including the date of the trade, the security traded, the quantity, and the price (we assume a single price per date, i.e., the adjusted closing price).

    Your result set must include the date of the trade, the security traded, and the quantity of the trade. We do not need to include the client’s name since we will be specifying a single client.

    Sort the results in ascending order by the date of the trade. Save the query as a string in the variable named sql_08.

    Hints:

    • You will need to join together 2 tables: trades and price_history.

    • Use the WHERE clause to specify that results should be for a single client only.

    For example, the report of all trades by Hermione Granger might look like this:

        trade_date  security    quantity    price
    0   2002-13-02  AMZN        100.0       24.110001
    1   2004-10-11  NKE         300.0       4.588508
    2   2008-08-05  WMT         400.0       45.496872
    3   2009-02-05  DIS         300.0       16.111172
    4   2010-11-05  VZ          100.0       21.618628
    5   2011-01-26  UNH         200.0       34.745735
    6   2011-05-17  PFE         400.0       15.394896
    7   2011-13-01  UTX         250.0       63.199425
    8   2013-01-24  DIS         250.0       35.122239
    9   2013-05-16  BA          250.0       81.244209
    10  2013-05-20  AAPL        250.0       56.105534
    11  2013-09-30  UTX         300.0       92.840584
    13  2012-01-31  MCD         400.0       78.082794
    13  2012-09-10  AXP         200.0       80.801811
    12  2015-04-07  PFE         50.0        28.807238
    15  2016-05-25  AXP         200.0       61.259422
    16  2020-02-10  UTX         200.0       155.808517
    

    * NOTE: The sample output was produced with a *different database, so you will not match these exact results. ****

  10. Write a SELECT query to produce a report showing the value of all securities in
    Angela Merkel‘s portfolio as of 2020-12-31.

    Your result set must include which security, the total quantity owned, the price per share, and the value of the position (quantity times price). We do not need to include the client’s name since we will be specifying a single client.

    Sort the results in descending order by the value of the position. Save the query as a string in the variable named sql_09.

    Hints:

    • You will need to join together 2 tables: trades and price_history.

    • Use the WHERE clause to specify that results should be for a single client only, and to specify date.

    • Use the aggregate function SUM and an appropriate GROUP BY clause to calculate the total quantity for each security.

    For example, a partial report of all positions held by Hermione Granger on 2020-03-31 might look like this:

    security    quantity    price   value
    0   AMZN    100.0   1949.719971 194971.997070
    1   UTX     750.0   94.330002   70747.501373
    2   MCD     400.0   165.350006  66120.002441
    3   AAPL    250.0   254.289993  63572.498322
    4   DIS     550.0   96.599998   53139.999161
    5   WMT     450.0   113.620003  51139.001336
    6   UNH     200.0   249.380005  49876.000977
    7   BA      250.0   129.139999  37284.999847
    8   AXP     400.0   85.137077   34054.830933
    9   NKE     300.0   82.739998   24821.999359
    10  PFE     450.0   32.639999   12687.999725
    11  VZ      100.0   53.158306   5315.830613
    

    * NOTE: The sample output was produced with a different database, so you will not match these exact results. *

    Your results will be different – for Angela Merkel, and for 2020-12-31 – so you will need to check them for correctness!

  11. Write a SELECT query to produce a report showing entire portfolio, with cost basis and current value for all positions (i.e., individual purchases) held in
    of Angela Merkel‘s portfolio as of 2020-12-31.

    The result set must include each trade, the security, the quantity bought in that trade, the price per share on the date of purchase, and the cost of the position (quantity X purchase price), the current price per share (as of 2020-12-31), and the current value of the position (quantity X current price).

    We do not need to include the client’s name since we will be specifying a single client.

    Sort the results in ascending order by the security’s name, and by the date on which the position was acquired. Save the query as a string in the variable named sql_10.

    Hints:

    • You will need to do 2 JOINs, to join together 2 tables:

      In the first JOIN, you will connect the each trade to the price_history, to find the historical price as of the trade date.

      In the second JOIN, you will connect the trade to the price_history as of 2020-12-31, to find the ‘current price’.

    • Give each JOIN its own alias, i.e.,

          SELECT [table_a.field1, table_z.field2, ...]
          FROM table_a
          INNER JOIN table_b as "table_z"
          ON [.... ]
      

      When you do this, you can refer to fields by the alias’ table name (i.e., table_z in the example above). This will help you prevent ambiguity when refering to the prices.

    • Use the WHERE clause to specify that results should be for a single client only, and to specify date.

    For example, a partial report of the positions held by Hermione Granger on 2020-03-31 might look like this:

        trade_date  security    quantity    purch. price    cost            current price   value
    0   2005-05-23  AMZN        100.0       35.680000       3568.000031     1949.719971     194971.997070
    1   2011-07-13  AXP         100.0       46.076759       4607.675934     85.137077       8513.707733
    2   2008-02-08  BA          50.0        57.977184       2898.859215     129.139999      7456.999969
    3   2006-02-27  CSCO        50.0        15.313878       765.693903      38.940796       1947.039795
    4   2009-09-16  DIS         250.0       24.429390       6107.347488     96.599998       24129.999619
    5   2016-06-21  DIS         200.0       93.231056       18646.211343    96.599998       19319.999695
    6   2015-02-10  GOOG        200.0       535.469849      107093.969727   1162.810059     232562.011719
    7   2005-10-17  KO          200.0       11.136735       2225.346947     44.250000       8850.000000
    8   2005-13-06  KO          200.0       11.394858       2278.971672     44.250000       8850.000000
    9   2009-08-10  KO          50.0        16.218536       810.926819      44.250000       2213.500000
    10  2011-03-24  KO          50.0        23.625408       1181.270409     44.250000       2213.500000
    11  2018-07-10  MCD         250.0       153.943283      38485.820770    165.350006      41337.501526
    13  2004-11-08  MRK         250.0       12.770099       3692.524672     76.940002       19235.000610
    13  2005-11-07  MRK         250.0       17.225212       4306.303501     76.940002       19235.000610
    12  2011-02-12  PFE         300.0       13.739881       4131.964169     32.639999       9791.999817
    15  2015-07-24  PFE         300.0       28.810135       8643.040466     32.639999       9791.999817
    16  2003-01-13  PG          50.0        25.874138       1393.706417     109.271553      5463.577652
    17  2009-06-24  SO          400.0       18.685911       7474.364471     54.139999       21655.999756
    18  2009-11-03  VZ          400.0       16.533220       6613.288116     53.158306       21363.322449
    19  2011-03-25  VZ          400.0       24.430988       9772.395325     53.158306       21363.322449
    20  2017-13-07  WMT         300.0       92.251381       27675.412276    113.620003      34086.000824
    

    * NOTE: The sample output was produced with a *different database, so you will not match these exact results. ****

    Your results will be different – for Angela Merkel, and for 2020-12-31 – so you will need to check them for correctness!


Submitting Your Work

20 points; will be assignmed by code review

Use the link to GradeScope (left) to submit your work.

Be sure to name your files correctly!

Under the heading for Assignment 12, attach these 2 required file: a12task1.py and portfolio.db.

When you upload the files, the autograder will test your functions/programs.

Warning: Beware of Global print statements

  • The autograder script cannot handle print statements in the global scope, and their inclusion causes this error:

autograder_fail

*   Why does this happen? When the autograder imports your file, the `print` 
    statement(s) execute (at import time), which causes this error.

*   You can prevent this error by not having any `print` statements in the global scope.
    Instead, create an `if __name__ == '__main__':` section at the bottom of the file, and
    put any test cases/print statements in that controlled block. For example:

        if __name__ == '__main__':

            ## put test cases here:
            print('fv_lump_sum(0.05, 2, 100)', fv_lump_sum(0.05, 2, 100))

*   `print` statements inside of functions do not cause this problem.

Notes: