Created by SmirkyGraphs. Code: GitHub. Source: BOE.

Breakdown

General Info

$525,881.75 was raised from 992 contributions from 29 different states

When Donations Occured

  • More donations later in the quarter 55% of the people donated in September
  • Only 14% of donations were in the first month of the quarter
  • Most of the money was raised outside of weekends, Sunday being the lowest and Tuesday being the highest

What Was Donated

  • 53% (525) were paid by Credit/Debit
  • 45% (448) were paid by Check
  • 992 donations from 904 doners
  • 525881.75 was raised
  • Average was 530
  • Highest was 1174.25, lowest was 1
  • Top 5 most frequent in order were: 1000, 500, 250, 25, 10

Where Donations Came From

  • 29 diffrent States
  • 225 different Cities
  • Top 5 States in order by value: RI, NY, MA, CT, CO
  • Top 5 Cities in order by value: Providence, New York, Barrington, Jamestown, Denver

Where RI Donations Came From

This looks specifically at donations where the person was living in Rhode Island

  • 600 donations from RI
  • Donations from 36 of the 39 municipalities in RI
  • Top 5 cities/towns in order: Providence, Barrington, Jamestown, East Greenwich, Cranston
  • Providence made up 29% of the doners from RI
  • Providence made up 29% of the total donated from RI
  • Counties in order by number of doners: Providence, Washington, Newport, Kent, Bristol
  • Counties in order by sum donated: Providence, Newport, Washington, Bristol, Kent

In State vs. Out of State

Comparing donations based on whether they live in RI or not

  • 60% (600) of donations were from Rhode Island
  • 40% (392) of donations were from another state
  • 47% (248681.22) of money donated was from Rhode Islanders
  • 53% (277200.53) of money donated was from out of state
  • Average in state 414
  • Average out of state 707

Where People Worked

  • 489 unique Employers
  • 65,309 raised from Retirees
  • 38,680 raised from Homemakers
  • 16,731 from Self-Employed
  • 15,261 from "Info Requested" (Left Empty)
  • Top 5 Companies: RI Medical Imaging, Pfizer Inc, General Dynamics, Citizens Bank, Pannone Lopes & Devereaux & West LLC
  • All Values Included

    • 56% (557) Worked In RI
    • 44% (435) Worked Outside RI
    • 44% (230305.00) Of the Money came from people who work in RI
    • 56% (295576.75) Of the Money came from people who work out of state
  • Extras Removed

    • 55% (541) Worked in RI
    • 45% (434) Worked Outside RI

Who Donated

  • 97% of Donations came from Individuals
  • 904 Unique Doners
  • 8 Interest, 7 PAC, 5 Party Donations
  • Top 5 first names: David, Michael, Susan, William, Robert
  • Top 5 last names: Ardaya, Kelly, Richardson, Sipprelle, Rogers

Donations 1k and Over

In state 37% out of state 63%


Data Importing & Exploring

In [1]:
# For data
import pandas as pd
from pandas import Series,DataFrame
import numpy as np

# For visualization
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('darkgrid')
%matplotlib inline
sns.set()
import datetime

my_color = sns.color_palette()
In [2]:
# loading the data
df = pd.read_csv("gina.csv", parse_dates=['receipt_dt'])
# removing personal address
df = df.drop(['address'], axis=1)
In [3]:
# Preview
df.head()
Out[3]:
contbr_nm first_nm last_nm tran_type contb_type receipt_dt contb_amt city state zip employer employ_address employ_city employ_state employ_zip weekday
0 Ingrid Ardaya Ingrid Ardaya Credit/Debit Individual 2017-07-01 5.0 Providence RI 2906 Disabled 11 North Avenue Providence RI 2906 Saturday
1 Ingrid Ardaya Ingrid Ardaya Credit/Debit Individual 2017-07-01 5.0 Providence RI 2906 Disabled 11 North Avenue Providence RI 2906 Saturday
2 Edna Panaggio Edna Panaggio Credit/Debit Individual 2017-07-01 5.0 Cranston RI 02920-4529 Homemaker 200 Hoffman Ave Cranston RI 02920-4529 Saturday
3 Eve Savitzky Eve Savitzky Credit/Debit Individual 2017-07-01 25.0 Providence RI 2906 Homemaker 21 Lincoln Ave Providence RI 2906 Saturday
4 Anna Siegler Anna Siegler Credit/Debit Individual 2017-07-02 50.0 Chicago IL 60637 Retired 5715 S. Kenwood Ave, Apt 4N Chicago IL 60637 Sunday
In [4]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 992 entries, 0 to 991
Data columns (total 16 columns):
contbr_nm         992 non-null object
first_nm          971 non-null object
last_nm           971 non-null object
tran_type         992 non-null object
contb_type        992 non-null object
receipt_dt        992 non-null datetime64[ns]
contb_amt         992 non-null float64
city              991 non-null object
state             991 non-null object
zip               988 non-null object
employer          965 non-null object
employ_address    930 non-null object
employ_city       931 non-null object
employ_state      931 non-null object
employ_zip        916 non-null object
weekday           992 non-null object
dtypes: datetime64[ns](1), float64(1), object(14)
memory usage: 124.1+ KB
In [5]:
df.shape
Out[5]:
(992, 16)
In [6]:
df.contb_amt.sum()
Out[6]:
525881.75

There are 992 Donations in Q3 making a total of $525,881.75
Those missing First/Last name are PAC's/Party donations


When Donations Occured


lets start by how much was raised each day

In [7]:
date_df = df.groupby(['receipt_dt'],as_index=False).sum()
date_df.plot('receipt_dt','contb_amt',figsize=(12,6),marker='', legend=False,
             linestyle='-',color='purple', xlim=('2017-07-01','2017-10-01'))
Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x30e958cfd0>
In [8]:
# Top 5 Days
date_df.sort_values(by='contb_amt',ascending=False).head()
Out[8]:
receipt_dt contb_amt
79 2017-09-27 26710.0
77 2017-09-25 23230.0
49 2017-08-28 22805.0
50 2017-08-29 20925.0
52 2017-08-31 20032.6
In [9]:
count_df = df.groupby(['receipt_dt'],as_index=False).count()
count_df.plot('receipt_dt','contb_amt',figsize=(12,6),marker='',linestyle='-',color='purple', xlim=('2017-07-01','2017-10-01'))
Out[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x30e9688d68>
In [10]:
mean_df = df.groupby(['receipt_dt'],as_index=False).mean()
mean_df.plot('receipt_dt','contb_amt',figsize=(12,6),marker='',linestyle='-',color='purple', xlim=('2017-07-01','2017-10-01'))
Out[10]:
<matplotlib.axes._subplots.AxesSubplot at 0x30ea78b358>
In [11]:
weekday_df = df
weekday_df['weekday'] = pd.Categorical(weekday_df['weekday'], 
        categories=['Monday','Tuesday','Wednesday','Thursday',
                    'Friday','Saturday', 'Sunday'], ordered=True)
In [12]:
weekday_df_sum = weekday_df.pivot_table(index=weekday_df['weekday'], values='contb_amt', 
                 aggfunc='sum').plot(kind='bar',rot=0,legend=False, title='Sum of Donations by Weekday')
In [13]:
weekday_df_count = weekday_df.pivot_table(index=weekday_df['weekday'], values='contb_amt', 
                    aggfunc='count').plot(kind='bar',rot=0, legend=False, title='Count of Donations by Weekday')
In [14]:
weekday_df_sum = weekday_df.pivot_table(index=weekday_df['weekday'], values='contb_amt', 
                    aggfunc='mean').plot(kind='bar',rot=0, legend=False, title='Average Donated by Weekday')
In [15]:
df['month'] = df['receipt_dt'].dt.month
In [16]:
df['receipt_dt'].dt.month.value_counts()
Out[16]:
9    549
8    306
7    137
Name: receipt_dt, dtype: int64
In [17]:
month_sum = df.pivot_table(index=df['month'], values='contb_amt', 
                 aggfunc='sum').plot(kind='bar',rot=0,legend=False, title='Sum of Donations by Month')
In [18]:
month_sum = df.pivot_table(index=df['month'], values='contb_amt', 
                 aggfunc='count').plot(kind='bar',rot=0,legend=False, title='Count of Donations by Month')


What the Donations Were

In [19]:
df.tran_type.value_counts()
Out[19]:
Credit/Debit    525
Check           448
In-Kind          10
Other             9
Name: tran_type, dtype: int64
In [20]:
df.tran_type.value_counts(normalize=True)
Out[20]:
Credit/Debit    0.529234
Check           0.451613
In-Kind         0.010081
Other           0.009073
Name: tran_type, dtype: float64
In [21]:
sns.factorplot('tran_type',data=df,kind="count")
Out[21]:
<seaborn.axisgrid.FacetGrid at 0x30eace8c18>
In [22]:
df['contb_amt'].sum()
Out[22]:
525881.75
In [23]:
df['contb_amt'].mode()
Out[23]:
0    1000.0
dtype: float64
In [24]:
df['contb_amt'].describe()
Out[24]:
count     992.000000
mean      530.122732
std       410.813074
min         1.000000
25%       100.000000
50%       500.000000
75%      1000.000000
max      1174.250000
Name: contb_amt, dtype: float64

I was surprised to see that the average donation was 530 compared to the presidential race when it was only 100
Lowest donation was 1 and highest was 1174

In [25]:
df['contb_amt'].hist(bins=25)
Out[25]:
<matplotlib.axes._subplots.AxesSubplot at 0x30eac7e5f8>
In [26]:
df['contb_amt'].value_counts().head()
Out[26]:
1000.0    380
500.0     140
250.0     110
25.0       63
10.0       53
Name: contb_amt, dtype: int64

Surprisingly 1000 was the most frequent donation
The top most frequent donation values were much higher then those during the presidential race


Where Did Donations Come From?

In [27]:
# 1 State was labeled "Ri" So replace it with RI
df = df.replace(['Ri'],'RI')
In [28]:
df.state.nunique()
Out[28]:
29
In [29]:
df.state.value_counts()
Out[29]:
RI    600
NY     76
MA     62
CT     43
CO     29
TX     28
DC     19
CA     18
MD     17
FL     15
NJ     15
VA     10
IL      8
PA      7
OR      7
AZ      6
NH      5
WA      4
VT      4
TN      3
MI      3
NM      3
HI      2
SC      2
WI      1
AL      1
GA      1
MO      1
NC      1
Name: state, dtype: int64
In [30]:
where_sum = df.pivot_table(index=df['state'], values='contb_amt', aggfunc='sum').sort_values(
    by='contb_amt').plot(kind='barh', rot=0, legend=False, title='Total Donated by State')
In [31]:
where_count = df.pivot_table(index=df['state'], values='contb_amt', aggfunc='count').sort_values(
    by='contb_amt').plot(kind='barh', rot=0, legend=False, title='Count of Donations by State')
In [32]:
where_avg = df.pivot_table(index=df['state'], values='contb_amt', aggfunc='mean').sort_values(
    by='contb_amt').plot(kind='barh', rot=0, legend=False, title='Avg Donated by State')
In [33]:
df.city.nunique()
Out[33]:
225
In [34]:
# Top 5 Cities
city_df = df.pivot_table('contb_amt',index='city',aggfunc='sum')
city_df = city_df.sort_values(by="contb_amt",ascending=False)

city_df.head()
Out[34]:
contb_amt
city
Providence 62241.43
New York 44350.00
Barrington 22470.00
Jamestown 21775.00
Denver 21000.00
In [35]:
city_sum = df.pivot_table(index=df['city'], values='contb_amt', aggfunc='sum').sort_values(
    by='contb_amt').nlargest(5, 'contb_amt').plot(kind='barh', color=my_color, legend=False, title='Total Donated')
In [36]:
city_count = df.pivot_table(index=df['city'], values='contb_amt', aggfunc='count').sort_values(
    by='contb_amt').nlargest(5, 'contb_amt').plot(kind='barh', color=my_color, legend=False, title='Num of Donations')


RI Donations

In [37]:
# Just donations from RI
RI_df = df[df.state == 'RI']
In [38]:
RI_df.city.unique()
Out[38]:
array(['Providence', 'Cranston', 'Lincoln', 'Barrington', 'Cumberland',
       'Pascoag', 'Smithfield', 'East Greenwich', 'Johnston', 'Pawtucket',
       'Jamestown', 'Wakefield', 'Saunderstown', 'Portsmouth', 'Newport',
       'West Warwick', 'North Kingstown', 'Warwick', 'PROVIDENCE',
       'Tiverton', 'Harmony', 'Bristol', 'Warren', 'West Greenwich',
       'Westerly', 'Riverside', 'N Kingstown', 'Rumford', 'Narragansett',
       'Exeter', 'Coventry', 'East Providence', 'South Kingstown',
       'North Providence', 'Middletown', 'Charlestown', 'North Kingstownq',
       'Foster', 'Block Island', 'Scituate', 'Little Compton',
       'New Shoreham', 'Peace Dale', 'Central Falls', 'North Scituate',
       'Glocester', 'E Greenwich', 'Woonsocket', 'Albion', 'Kingston'], dtype=object)
In [39]:
# Connect small towns to the City/Town they're part of
RI_df = RI_df.replace(['Pascoag'],'Burrillville')
RI_df = RI_df.replace(['Wakefield','Kingston','Peace Dale'],'South Kingstown')
RI_df = RI_df.replace(['Saunderstown','N Kingstown','North Kingstownq'],'North Kingstown')
RI_df = RI_df.replace(['E Greenwich'],'East Greenwich')
RI_df = RI_df.replace(['PROVIDENCE'],'Providence')
RI_df = RI_df.replace(['Harmony'],'Glocester')
RI_df = RI_df.replace(['Riverside','Rumford'],'East Providence')
RI_df = RI_df.replace(['Block Island'],'New Shoreham')
RI_df = RI_df.replace(['North Scituate'],'Scituate')
RI_df = RI_df.replace(['Albion'],'Lincoln')
In [40]:
RI_df.city.nunique()
Out[40]:
36
In [41]:
ri_city = df.pivot_table(index=RI_df['city'], values='contb_amt', aggfunc='count').sort_values(ascending=True,
    by='contb_amt').plot(kind='barh', figsize=(12,9), legend=False, title='Num of Donations by City')
In [42]:
ri_city = df.pivot_table(index=RI_df['city'], values='contb_amt', aggfunc='sum').sort_values(ascending=True,
    by='contb_amt').plot(kind='barh', figsize=(12,9), legend=False, title='Total Donated by City')
In [43]:
RI_df['city'].value_counts().sum()
Out[43]:
600
In [44]:
RI_df['city'].value_counts().head()
Out[44]:
Providence        175
East Greenwich     44
Barrington         44
Jamestown          38
Cranston           35
Name: city, dtype: int64
In [45]:
RI_Sum = RI_df.pivot_table('contb_amt',index='city',aggfunc='sum')
RI_Sum = RI_Sum.sort_values(by='contb_amt', ascending=False)

RI_Sum.sum()
Out[45]:
contb_amt    248681.22
dtype: float64
In [46]:
RI_Sum
Out[46]:
contb_amt
city
Providence 70988.43
Barrington 22470.00
Jamestown 21775.00
East Greenwich 19170.00
Cranston 13061.00
North Kingstown 9585.00
Newport 8390.00
Westerly 8305.00
Lincoln 7360.00
East Providence 7170.79
Warwick 6490.00
Narragansett 6310.00
Portsmouth 5505.00
Bristol 5380.00
South Kingstown 4935.00
Charlestown 4100.00
North Providence 3525.00
Pawtucket 3130.00
Johnston 3125.00
Exeter 3035.00
Cumberland 2500.00
Middletown 2375.00
Warren 2100.00
Scituate 1500.00
West Greenwich 1200.00
Coventry 1060.00
Foster 1000.00
Smithfield 950.00
Tiverton 550.00
West Warwick 500.00
Burrillville 500.00
Woonsocket 275.00
Glocester 225.00
Little Compton 100.00
Central Falls 25.00
New Shoreham 11.00
In [47]:
# dictionary of RI Counties
county_map = {'Barrington': 'BRISTOL',
            'Bristol': 'BRISTOL',
            'Burrillville': 'PROVIDENCE',
            'Central Falls': 'PROVIDENCE',
            'Charlestown': 'WASHINGTON',
            'Coventry': 'KENT',
            'Cranston': 'PROVIDENCE',
            'Cumberland': 'PROVIDENCE',
            'East Greenwich': 'KENT',
            'East Providence': 'PROVIDENCE',
            'Exeter': 'WASHINGTON',
            'Foster': 'PROVIDENCE',
            'Glocester': 'PROVIDENCE',
            'Hopkinton': 'WASHINGTON',
            'Jamestown': 'NEWPORT',
            'Johnston': 'PROVIDENCE',
            'Lincoln': 'PROVIDENCE',
            'Little Compton': 'NEWPORT',
            'Middletown': 'NEWPORT',
            'Narragansett': 'WASHINGTON',
            'Newport': 'NEWPORT',
            'New Shoreham': 'WASHINGTON',
            'North Kingstown': 'WASHINGTON',
            'North Providence': 'PROVIDENCE',
            'North Smithfield': 'PROVIDENCE',
            'Pawtucket': 'PROVIDENCE',
            'Portsmouth': 'NEWPORT',
            'Providence': 'PROVIDENCE',
            'Richmond': 'WASHINGTON',
            'Scituate': 'PROVIDENCE',
            'Smithfield': 'PROVIDENCE',
            'South Kingstown': 'WASHINGTON',
            'Tiverton': 'NEWPORT',
            'Warren': 'BRISTOL',
            'Warwick': 'KENT',
            'Westerly': 'WASHINGTON',
            'West Greenwich': 'KENT',
            'West Warwick': 'KENT',
            'Woonsocket': 'PROVIDENCE'}

# creating a party column and mapping party to canidate
RI_df['County'] = RI_df.city.map(county_map)
In [48]:
RI_df['County'].value_counts()
Out[48]:
PROVIDENCE    303
WASHINGTON     86
NEWPORT        77
KENT           70
BRISTOL        64
Name: County, dtype: int64
In [49]:
ri_city = df.pivot_table(index=RI_df['County'], values='contb_amt', aggfunc='count').sort_values(ascending=True,
    by='contb_amt').plot(kind='barh', legend=False, title='Num of Donations by County')
In [50]:
ri_city = df.pivot_table(index=RI_df['County'], values='contb_amt', aggfunc='sum').sort_values(ascending=True,
    by='contb_amt').plot(kind='barh', legend=False, title='Total Donated by County')


In State vs. Out of State

In [51]:
def in_ri(state):
    if state == 'RI':
        return 'in state'
    else:
        return 'out of state'
In [52]:
df['lives'] = df['state'].apply(in_ri)
In [53]:
df_lives = df
In [54]:
df['lives'] = pd.Categorical(df['lives'], categories=['in state','out of state'], ordered=True)
In [55]:
df.head()
Out[55]:
contbr_nm first_nm last_nm tran_type contb_type receipt_dt contb_amt city state zip employer employ_address employ_city employ_state employ_zip weekday month lives
0 Ingrid Ardaya Ingrid Ardaya Credit/Debit Individual 2017-07-01 5.0 Providence RI 2906 Disabled 11 North Avenue Providence RI 2906 Saturday 7 in state
1 Ingrid Ardaya Ingrid Ardaya Credit/Debit Individual 2017-07-01 5.0 Providence RI 2906 Disabled 11 North Avenue Providence RI 2906 Saturday 7 in state
2 Edna Panaggio Edna Panaggio Credit/Debit Individual 2017-07-01 5.0 Cranston RI 02920-4529 Homemaker 200 Hoffman Ave Cranston RI 02920-4529 Saturday 7 in state
3 Eve Savitzky Eve Savitzky Credit/Debit Individual 2017-07-01 25.0 Providence RI 2906 Homemaker 21 Lincoln Ave Providence RI 2906 Saturday 7 in state
4 Anna Siegler Anna Siegler Credit/Debit Individual 2017-07-02 50.0 Chicago IL 60637 Retired 5715 S. Kenwood Ave, Apt 4N Chicago IL 60637 Sunday 7 out of state
In [56]:
count_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='count').plot(kind='bar',
                        rot=0, color=my_color, legend=False, title='Count of Donation')
In [57]:
print(df['lives'].value_counts())
(df['lives'].value_counts(normalize=True))
in state        600
out of state    392
Name: lives, dtype: int64
Out[57]:
in state        0.604839
out of state    0.395161
Name: lives, dtype: float64

60% (600) Were from Rhode Island
40% (392) Were from another state

In [58]:
mean_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='mean').plot(kind='bar',
                        rot=0, color=my_color, legend=False, title='Average Donation')
In [59]:
sum_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='sum').plot(kind='bar',
                        rot=0, color=my_color, legend=False, title='Total Donated')
In [60]:
percent_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='sum')
In [61]:
total_sum = percent_df.contb_amt.sum()
df['lives'] = df['state'].apply(in_ri)
percent_df['Percent'] = percent_df['contb_amt'] / total_sum
In [62]:
percent_df.head()
Out[62]:
contb_amt Percent
lives
in state 248681.22 0.472884
out of state 277200.53 0.527116
In [63]:
mean_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='mean')
mean_df.head()
Out[63]:
contb_amt
lives
in state 414.468700
out of state 707.144209


Where People Worked

In [64]:
employer_df = df.pivot_table('contb_amt',index='employer',aggfunc='sum')

# Combining Electric Boat & Genral Dynamics
employer_df.loc['General Dynamics'] = employer_df.loc['Electric Boat Corporation'] + employer_df.loc['General Dynamics']
employer_df.drop('Electric Boat Corporation',inplace=True)

employer_df = employer_df.sort_values(by = 'contb_amt',ascending=True)
In [65]:
employer_df.count()
Out[65]:
contb_amt    489
dtype: int64

Donations from people who worked at 489 different companies, lets narrow it down to companies over $1000

In [66]:
# Getting all employer records over $1000

employer_df = employer_df[employer_df['contb_amt'] > 1000]
employer_df.plot(kind='barh',figsize=(10,16))
Out[66]:
<matplotlib.axes._subplots.AxesSubplot at 0x30eb1f4780>
In [67]:
# Graphing Only Companies
employer_df.drop('Homemaker',inplace=True)
employer_df.drop('Retired',inplace=True)
employer_df.drop('Self Employed',inplace=True)
employer_df.drop('Info Requested',inplace=True)

employer_df = employer_df.sort_values(by = 'contb_amt',ascending=True)
In [68]:
# Getting all employer records over $1000
employer_df = employer_df[employer_df['contb_amt'] > 1000]
employer_df.plot(kind='barh',figsize=(10,16))
Out[68]:
<matplotlib.axes._subplots.AxesSubplot at 0x30ecb26978>
In [69]:
def in_ri(employ_state):
    if employ_state == 'RI':
        return 'in state'
    else:
        return 'out of state'
In [70]:
df['works'] = df['employ_state'].apply(in_ri)
In [71]:
df.head()
Out[71]:
contbr_nm first_nm last_nm tran_type contb_type receipt_dt contb_amt city state zip employer employ_address employ_city employ_state employ_zip weekday month lives works
0 Ingrid Ardaya Ingrid Ardaya Credit/Debit Individual 2017-07-01 5.0 Providence RI 2906 Disabled 11 North Avenue Providence RI 2906 Saturday 7 in state in state
1 Ingrid Ardaya Ingrid Ardaya Credit/Debit Individual 2017-07-01 5.0 Providence RI 2906 Disabled 11 North Avenue Providence RI 2906 Saturday 7 in state in state
2 Edna Panaggio Edna Panaggio Credit/Debit Individual 2017-07-01 5.0 Cranston RI 02920-4529 Homemaker 200 Hoffman Ave Cranston RI 02920-4529 Saturday 7 in state in state
3 Eve Savitzky Eve Savitzky Credit/Debit Individual 2017-07-01 25.0 Providence RI 2906 Homemaker 21 Lincoln Ave Providence RI 2906 Saturday 7 in state in state
4 Anna Siegler Anna Siegler Credit/Debit Individual 2017-07-02 50.0 Chicago IL 60637 Retired 5715 S. Kenwood Ave, Apt 4N Chicago IL 60637 Sunday 7 out of state out of state
In [72]:
# Including Extras
count_df = df.pivot_table(index=df['works'], values='contb_amt', aggfunc='count').plot(kind='bar',
                        rot=0, color=my_color, legend=False, title='Count of Donation')
In [73]:
df['works'].value_counts()
Out[73]:
in state        557
out of state    435
Name: works, dtype: int64
In [74]:
# Including Extras
count_df = df.pivot_table(index=df['works'], values='contb_amt', aggfunc='sum').plot(kind='bar',
                        rot=0, color=my_color, legend=False, title='Total Donated')
In [75]:
# Including Extras getting % of sum
percent_df = df.pivot_table(index=df['works'], values='contb_amt', aggfunc='sum')

total_sum = df.contb_amt.sum()
percent_df['Percent'] = percent_df['contb_amt'] / total_sum

percent_df.head()
Out[75]:
contb_amt Percent
works
in state 230305.00 0.437941
out of state 295576.75 0.562059
In [76]:
# Removing Extras
emp_df = df[df.employer != 'Homemaker']
emp_df = df[df.employer != 'Retired']
emp_df = df[df.employer != 'Self Employed']
emp_df = df[df.employer != 'Info Requested']
emp_df = df[df.employer != 'Disabled']
In [77]:
# Extras Removed
count_df = emp_df.pivot_table(index=df['works'], values='contb_amt', aggfunc='count').plot(kind='bar',
                        rot=0, color=my_color, legend=False, title='Count of Donation')
In [78]:
emp_df['works'].value_counts()
Out[78]:
in state        541
out of state    434
Name: works, dtype: int64
In [79]:
# Extras Removed
count_df = emp_df.pivot_table(index=df['works'], values='contb_amt', aggfunc='sum').plot(kind='bar',
                        rot=0, color=my_color, legend=False, title='Total Donated')


Who Donated

In [80]:
df.contbr_nm.nunique()
Out[80]:
904
In [81]:
df.first_nm.value_counts().head()
Out[81]:
David      31
Michael    25
Susan      20
William    19
Robert     19
Name: first_nm, dtype: int64
In [82]:
df.last_nm.value_counts().head()
Out[82]:
Ardaya        16
Kelly          9
Richardson     6
Watson         5
Pande          5
Name: last_nm, dtype: int64
In [83]:
df.contb_type.value_counts()
Out[83]:
Individual              966
Interest Received         8
PAC                       6
In-Kind - Individual      5
In-Kind - Party           4
Refund/Rebate             1
In-Kind - PAC             1
Party                     1
Name: contb_type, dtype: int64
In [84]:
df.contb_type.value_counts(normalize=True)
Out[84]:
Individual              0.973790
Interest Received       0.008065
PAC                     0.006048
In-Kind - Individual    0.005040
In-Kind - Party         0.004032
Refund/Rebate           0.001008
In-Kind - PAC           0.001008
Party                   0.001008
Name: contb_type, dtype: float64


Donations 1k and Over

In [85]:
don_1k = df[df['contb_amt'] >= 1000]
In [86]:
don_1k = df[(df['contb_amt'] >= 1000)]
In [87]:
don_1k.lives.value_counts(normalize=True)
Out[87]:
out of state    0.630208
in state        0.369792
Name: lives, dtype: float64
In [88]:
sum_df = don_1k.pivot_table(index=df_lives['lives'], values='contb_amt', aggfunc='sum').plot(kind='bar',
                        rot=0, color=my_color, legend=False, title='Total Donated')


When Out of State Passes In State

In [89]:
don_df_100 = df[df.contb_amt <= 100]
don_df_250 = df[df.contb_amt <= 250]
don_df_350 = df[df.contb_amt <= 350]
don_df_500 = df[df.contb_amt <= 500]
don_df_750 = df[df.contb_amt <= 750]
don_df_1000 = df[df.contb_amt <= 1000]
In [90]:
# Concatinating the datasets together
frames = [don_df_100, don_df_250, don_df_350, don_df_500, don_df_750, don_df_1000]

don_concat = pd.concat(frames, keys=['100', '250', '350', '500', '750', '1000'])

# resetting the index and dropping the columns we don't need
don_concat = don_concat.reset_index()
In [91]:
# Pivoting by the amt ranges
don_concat = don_concat.pivot_table('contb_amt',index='level_0',columns = 'lives',aggfunc='sum')
In [92]:
new_index= ['100', '250', '350', '500', '750', '1000']
don_concat = don_concat.reindex(new_index)
don_concat.head()
Out[92]:
lives in state out of state
level_0
100 5683.68 2901.53
250 33314.22 10850.53
350 34914.22 11800.53
500 86789.22 30700.53
750 103444.22 35200.53
In [93]:
don_concat[['in state','out of state']].plot(kind='bar',figsize=(12,4))
plt.xlabel('Ammount')
locs, labels = plt.xticks()
plt.setp(labels, rotation=360)
plt.title('In State vs. Out of State')
Out[93]:
<matplotlib.text.Text at 0x30ecdef400>


Top 4 Most Frequent Donations In State vs Out of State

In [94]:
# Top 4 Donated Values
don_25 = df[df.contb_amt == 25]
don_250 = df[df.contb_amt == 250]
don_500 = df[df.contb_amt == 500]
don_1000 = df[df.contb_amt == 1000]
In [95]:
# Concatinating the datasets together
frames = [don_25, don_250, don_500, don_1000]

don_concat = pd.concat(frames, keys=['25', '250', '500', '1000'])
#resetting the index and dropping the columns we don't need
don_concat = don_concat.reset_index()
In [96]:
# Pivoting by the amt ranges
don_concat = don_concat.pivot_table('contb_amt',index='level_0',columns = 'lives',aggfunc='sum')
In [97]:
don_concat.head()
Out[97]:
lives in state out of state
level_0
1000 138000.0 242000.0
25 1275.0 300.0
250 22000.0 5500.0
500 51500.0 18500.0
In [98]:
new_index= ['25', '250', '500', '1000']
don_concat = don_concat.reindex(new_index)
don_concat.head()
Out[98]:
lives in state out of state
level_0
25 1275.0 300.0
250 22000.0 5500.0
500 51500.0 18500.0
1000 138000.0 242000.0
In [99]:
don_concat[['in state','out of state']].plot(kind='bar',figsize=(12,4))
plt.xlabel('Ammount')
locs, labels = plt.xticks()
plt.setp(labels, rotation=360)
plt.title('In State vs. Out of State')
Out[99]:
<matplotlib.text.Text at 0x30ed5c0ba8>