Project #6
More Doctor Studies (Top 20)
(Due 28 April 2006)

With the success of the previous doctor survey and the information that it provided (Project #4), UPMC has decided to conduct a more thorough study of all of the physicians who work in its emergency rooms.  The current survey records patient information for 100 patients treated by each of the UPMC physicians.  The goal this time is to determine the top 20 physicians in the system; again, efficiency is used to rank them.

You are to write a program that can read a file containing survey data, determine doctor efficiencies, produce a report on the screen, and write another file with data synthesized from the survey.  The survey data is arranged similarly to the data file for Project #4; that is, each line has the form:

lastName, firstName    rating#1  time#1  rating#2  time#2  rating#3  time#3  . . .

There are 100 rating-time pairs for each doctor.  The actual contents of the file begin like this (the lines are abbreviated here):

Alligator, Albert    4  46  8  88  1  21  4  60  3  39  5  67  8  84 and 93 more
Arbuckle, Jon        6  72  6  64  3  51  4  60  2  32  4  58  2  32 and 93 more
Bailey, Beetle       5  63  6  72  3  43  5  57  4  50  3  49  9 101 and 93 more
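
One way to read a single line of this form is sketched below.  The struct SurveyLine and the function parseSurveyLine are names made up for this illustration, and the sketch assumes that every first name is a single word with no spaces in it.  It reads as many rating-time pairs as the line contains rather than insisting on exactly 100.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type; the field names are illustrative only.
struct SurveyLine {
    std::string lastName;
    std::string firstName;
    std::vector<std::pair<int, int> > visits;  // (rating, time in minutes)
};

// Parses one line of the form "lastName, firstName  r1 t1  r2 t2 ...".
// Returns false if the comma or the first name is missing.
// Assumption: the first name is a single word with no embedded spaces.
bool parseSurveyLine(const std::string& line, SurveyLine& out) {
    std::string::size_type comma = line.find(',');
    if (comma == std::string::npos) {
        return false;
    }
    out.lastName = line.substr(0, comma);
    out.visits.clear();

    // Everything after the comma: first name, then the number pairs.
    std::istringstream rest(line.substr(comma + 1));
    if (!(rest >> out.firstName)) {
        return false;
    }
    int rating = 0;
    int time = 0;
    while (rest >> rating >> time) {
        out.visits.push_back(std::make_pair(rating, time));
    }
    return true;
}
```

A caller would typically read each line of the survey file with std::getline and pass it to this function, then check that visits.size() is 100 before going on.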

Your program's first task is to read all the information for each doctor and determine the number of patients for each rating (1 through 9) and the average time for each rating value.  These values represent the synthesized data that you will need to write to a file eventually.  Then, using the counts and average times,  calculate the weighted average efficiency number using the following approach.

  2.0 * average_time / log(rating + 1) * number_of_patients

That is, use the standard efficiency formula from Project #4, but with the average time in place of a single time, and multiply the result by the number of patients with that rating.  Do this for each rating, 1 through 9, and add all results together.  Then, divide the sum by 100 to get the average efficiency number for the doctor.  For example, Dr. Albert Alligator saw 12 patients of rating 1 with an average time of 24.3 minutes and 14 patients of rating 2 with an average time of 34.4 minutes, etc.  Dr. Alligator's weighted average efficiency number is then

( 2.0 * 24.3 / log(2) * 12  +  2.0 * 34.4 / log(3) * 14 + . . . )  /  100
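
The sum above might be coded along the following lines.  This is only a sketch: the function name weightedEfficiency and its parameters are invented for the example, the arrays are assumed to be indexed by rating (so index 0 is unused), and log is assumed to mean the natural logarithm, which is what std::log computes.  Check that assumption against the formula you used in Project #4.

```cpp
#include <cmath>

// Sketch of the weighted average efficiency calculation.
// count[r] and avgTime[r] hold the number of patients and the average
// time for rating r, for r = 1 .. numRatings (index 0 is unused).
// Assumption: "log" in the handout is the natural log (std::log).
double weightedEfficiency(const int count[], const double avgTime[],
                          int numRatings, int totalPatients) {
    double sum = 0.0;
    for (int r = 1; r <= numRatings; ++r) {
        // A rating with no patients has count 0 and adds nothing.
        sum += 2.0 * avgTime[r] / std::log(r + 1.0) * count[r];
    }
    return sum / totalPatients;
}
```

For this survey the call would use numRatings = 9 and totalPatients = 100.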

Once you have calculated the number of patients with each rating, the average time for each rating, and the efficiency number for a doctor, you have the information needed for the display and file output.  Your program is to display information about the top twenty doctors, i.e., the doctors with the 20 lowest efficiency numbers.  The display should show the doctor's name, efficiency number and the average times for each of the rating categories.  The display should be in order, top doctor first, second next, and so on down to the 20th best.  If Albert Alligator were one of the top twenty, your program should display:

Alligator, Albert   68.24  24.3  34.4  42.5  52.1  61.3  71.3  80.2  87.3  97.0

However, he is not one of the top twenty.  The display should be in a table form with labels to indicate what the values represent.
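
A possible shape for the sorted, labeled display is sketched here.  The struct DoctorResult, the comparison function and the column widths are all choices made for this example, not requirements; any layout that is in table form with labels will do.

```cpp
#include <algorithm>
#include <cstddef>
#include <iomanip>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical result record; index r of avgTime is rating r (0 unused).
struct DoctorResult {
    std::string name;      // "Last, First"
    double efficiency;
    double avgTime[10];
};

// Orders doctors so the lowest (best) efficiency number comes first.
bool lowerEfficiencyFirst(const DoctorResult& a, const DoctorResult& b) {
    return a.efficiency < b.efficiency;
}

// Prints a header row and the best howMany doctors to out.
// The vector is taken by value, so the caller's copy is not reordered.
void printTopDoctors(std::ostream& out, std::vector<DoctorResult> doctors,
                     std::size_t howMany) {
    std::sort(doctors.begin(), doctors.end(), lowerEfficiencyFirst);

    out << std::left << std::setw(22) << "Doctor"
        << std::right << std::setw(8) << "Effic";
    for (int r = 1; r <= 9; ++r) {
        out << std::setw(6) << 'R' << r;   // column labels R1 .. R9
    }
    out << '\n';

    out << std::fixed;
    std::size_t limit = std::min<std::size_t>(howMany, doctors.size());
    for (std::size_t i = 0; i < limit; ++i) {
        out << std::left << std::setw(22) << doctors[i].name
            << std::right << std::setprecision(2) << std::setw(8)
            << doctors[i].efficiency << std::setprecision(1);
        for (int r = 1; r <= 9; ++r) {
            out << std::setw(7) << doctors[i].avgTime[r];
        }
        out << '\n';
    }
}
```

Calling printTopDoctors(std::cout, doctors, 20) would produce the required display.  Because the function writes to any std::ostream, the same code can be pointed at a string stream for testing.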

Your program is also supposed to write information about all doctors to an output file; this file will be processed by someone else to get more statistical information at a later date.  For the file, the program should write the doctor's name, average efficiency number, and nine pairs of values (frequency of a rating, average time for that rating) for every doctor in the study.  For Albert Alligator, the file should contain a line that begins like this:

Alligator, Albert   68.24  12  24.3  14  34.4  16  42.5  21  52.1  12  61.3...

The file must be written in sorted order, so that the doctor with the lowest average efficiency number is first and the one with the highest average efficiency number is last.
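
Writing one doctor's line to the output file could look something like the sketch below.  The function name writeSynthesizedLine and the two-space separator are assumptions made for the example; the arrays are again indexed by rating with index 0 unused.  The caller is expected to have sorted the doctors by efficiency number before writing them.

```cpp
#include <iomanip>
#include <ostream>
#include <string>

// Writes: name, efficiency (2 decimals), then for each rating the pair
// (count, average time with 1 decimal).  Fields are separated by two
// spaces, which is an arbitrary choice for this sketch.
void writeSynthesizedLine(std::ostream& out, const std::string& name,
                          double efficiency, const int count[],
                          const double avgTime[], int numRatings) {
    out << name << "  " << std::fixed << std::setprecision(2) << efficiency;
    out << std::setprecision(1);
    for (int r = 1; r <= numRatings; ++r) {
        out << "  " << count[r] << "  " << avgTime[r];
    }
    out << '\n';
}
```

With an std::ofstream opened on the synthesized data file, the program would call this once per doctor, in sorted order, with numRatings = 9.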

Warning:  When calculating the average time a doctor takes with patients at each rating, be careful.  Although every doctor has 100 patients in the survey, there are a few doctors that have no patients with one rating or another.  Two examples that I noticed by glancing over the data are that Blondie Bumstead had no patients at rating 9 and Michael Patterson had no patients at rating 8; there may be others.  For such situations, the average time must be set to zero; doing this will not affect the doctor's average efficiency number or his/her position relative to the other doctors.
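
The counting and averaging step, including the guard against dividing by zero, might be sketched as follows.  RatingSummary and summarize are hypothetical names; the sketch takes the (rating, time) pairs for one doctor and silently skips any rating outside 1 through 9, which is a choice made here and not something the survey format promises.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

const int NUM_RATINGS = 9;

// Hypothetical container for one doctor's synthesized data.
// Index 0 is unused so that index r corresponds to rating r.
struct RatingSummary {
    int count[NUM_RATINGS + 1];
    double avgTime[NUM_RATINGS + 1];
};

// Counts the patients at each rating and averages their times.
RatingSummary summarize(const std::vector<std::pair<int, int> >& visits) {
    RatingSummary s;
    double total[NUM_RATINGS + 1];
    for (int r = 0; r <= NUM_RATINGS; ++r) {
        s.count[r] = 0;
        s.avgTime[r] = 0.0;
        total[r] = 0.0;
    }
    for (std::size_t i = 0; i < visits.size(); ++i) {
        int rating = visits[i].first;
        if (rating < 1 || rating > NUM_RATINGS) {
            continue;  // ignore out-of-range ratings (assumption)
        }
        s.count[rating]++;
        total[rating] += visits[i].second;
    }
    for (int r = 1; r <= NUM_RATINGS; ++r) {
        // Divide only when there is at least one patient; otherwise
        // the average stays at zero, as the warning requires.
        if (s.count[r] > 0) {
            s.avgTime[r] = total[r] / s.count[r];
        }
    }
    return s;
}
```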

Your program must prompt for the name of the file containing the input data and for the name of the synthesized data file.  This is because UPMC wants to use your program each time the survey is conducted, and the file names will be different each time.  The survey file this time is called study2.txt.  You can find it on the I: drive (I:\jlwolfe\110\study2.txt) or on the P: drive at P:\courses\Spring2006\cosc\cosc110\xxx\information\study2.txt, where xxx is 002 or 003 (your section number).

Hand in a printout of your well-documented program and a printout of the captured display from the screen.  Name both the .cpp file and the output file from the program after yourself.  I might name mine wolfe-p6.cpp and wolfe-results.txt, respectively. Copy both of these files to the handin folder on the P: drive.
 
 

Extra Credit:
a.  Display the name, average efficiency number and the nine average times for the bottom twenty physicians in the study.  Do not show the physicians in the middle: the first 20 are required and the last 20 are for extra credit; showing any of the others will be regarded as wrong.  These last twenty must also be displayed in sorted order based on average efficiency number.

b.  Display the total number of patients seen for each rating, including all doctors together.

c.  For which patient rating do the UPMC doctors have the lowest (best) average efficiency number?  Calculate the average efficiency numbers for all patients at each rating 1, 2, ... and calculate a weighted average of them to answer the question.