Working with CSV files

  • Writer: 라임 샹큼
  • Mar 26
  • 2 min read

Until now I had mostly worked with txt and py files, but I branched out to csv files for organizing data and learned about the pandas library, which helps a lot with organizing information.


import pandas

data = pandas.read_csv('2018_Central_Park_Squirrel_Census_-_Squirrel_Data.csv')
squirrel_data = data.to_dict()
squirrel_color_dict = squirrel_data['Primary Fur Color']

squirrel_colors = []
squirrel_color_count = {}

# Collect each distinct fur color once.
for color in squirrel_color_dict.values():
    if color not in squirrel_colors:
        squirrel_colors.append(color)

# Start every color's count at zero.
# Note: pandas reads missing values as float NaN, not the string 'NaN',
# so this check does not filter them out.
for color in squirrel_colors:
    if color != 'NaN':
        if color not in squirrel_color_count.keys():
            squirrel_color_count[color] = 0

# Tally one squirrel at a time.
for squirrel in squirrel_color_dict.values():
    squirrel_color_count[squirrel] += 1

squirrels = {'fur colors': list(squirrel_color_count.keys()),
             'number of': list(squirrel_color_count.values())}
print(squirrels)
print(pandas.DataFrame(squirrels))

Using a dataset that recorded information about squirrels in New York's Central Park, I tried to write code that would pull out just the squirrels' fur colors. Although this code worked, it gave me information I didn't need and was too long and wordy. I didn't make good use of the pandas library's functions either.


import pandas

data = pandas.read_csv('2018_Central_Park_Squirrel_Census_-_Squirrel_Data.csv')

# Count the rows that match each fur color.
gray_squirrel_count = len(data[data['Primary Fur Color'] == 'Gray'])
red_squirrel_count = len(data[data['Primary Fur Color'] == 'Cinnamon'])  # 'Cinnamon' in the data, labeled 'Red' below
black_squirrel_count = len(data[data['Primary Fur Color'] == 'Black'])

# Build the summary table by hand, since the categories are already known.
Squirrel_data = {'Fur color': ['Gray', 'Red', 'Black'],
                 'Number': [gray_squirrel_count, red_squirrel_count, black_squirrel_count]}

sql_dta = pandas.DataFrame(Squirrel_data)
print(sql_dta)
sql_dta.to_csv('Squirrel_fur_color.csv')


I used to think the best code was code that could handle any change in the data, but I realized that if I'm the one who reads the code and the dataset is small, I can just set the variables by hand as I please. The code above assumes I already know the categories the data is organized into.
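If the categories are not known in advance, the counting can also be left to pandas. Here is a minimal sketch, assuming a small made-up table in place of the real census file (the rows are illustrative, not census data). It uses value_counts(), a standard pandas Series method that skips missing values by default.

```python
import pandas

# Hypothetical stand-in for the census file; these rows are made up.
data = pandas.DataFrame({
    'Primary Fur Color': ['Gray', 'Cinnamon', 'Gray', None, 'Black', 'Gray'],
})

# value_counts() tallies each distinct value and ignores missing entries.
color_counts = data['Primary Fur Color'].value_counts()

# Turn the result into a two-column table like the hand-built one.
fur_table = color_counts.rename_axis('Fur color').reset_index(name='Number')
print(fur_table)
```

The trade-off is the one described above: this version adapts to whatever colors appear in the column, while the hand-built version lets me choose the labels myself.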

The harder, more confusing part of this challenge was that I kept forgetting how to use the data[data[column] == value] pattern.

It really just means: within data, keep only the rows where the value in the given column equals the value you are looking for.
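The filter is easier to remember when it is split into two steps. Here is a small sketch with a made-up three-row table (the rows and the is_gray and gray_rows names are illustrative only). The inner comparison produces one True/False per row, and the outer brackets keep only the rows marked True.

```python
import pandas

# Made-up rows for illustration only.
data = pandas.DataFrame({
    'Primary Fur Color': ['Gray', 'Black', 'Gray'],
    'Age': ['Adult', 'Adult', 'Juvenile'],
})

# Step 1: compare every value in the column, giving one True/False per row.
is_gray = data['Primary Fur Color'] == 'Gray'

# Step 2: use that True/False Series to keep only the matching rows.
gray_rows = data[is_gray]

print(is_gray.tolist())
print(len(gray_rows))
```

Writing data[data['Primary Fur Color'] == 'Gray'] does both steps in a single line.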


Things I learned

  • Can read a csv file into a list by importing Python's built-in csv module

    Can list all rows of a csv file with csv.reader(file), where file is an open file object rather than a file name

import csv

weather_list = []
with open('weather_data.csv') as data:
    read_data = csv.reader(data)
    for row in read_data:
        weather_list.append(row)
print(weather_list)
  • Can read a file in one short line using the pandas library:

import pandas

pandas.read_csv('weather_data.csv')
  • Can create a data table from a dictionary using the DataFrame class that comes with pandas

food = {'food': ['grapes', 'berries', 'apples'],
        'price': [20, 10, 45]}
food_data = pandas.DataFrame(food)
print(food_data)
  • Can save a DataFrame as a new csv file with pandas

food_data.to_csv('food_data.csv')
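One detail worth knowing when saving: by default to_csv also writes the row index as an unnamed first column. Here is a minimal sketch, assuming the same food table and writing to an in-memory buffer instead of a real file so nothing is created on disk. index=False is a standard to_csv parameter; the buffer, csv_text and round_trip names are just for illustration.

```python
import io
import pandas

food = {'food': ['grapes', 'berries', 'apples'],
        'price': [20, 10, 45]}
food_data = pandas.DataFrame(food)

# Write to an in-memory text buffer (a stand-in for 'food_data.csv').
buffer = io.StringIO()
food_data.to_csv(buffer, index=False)  # index=False leaves out the row numbers
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back gives the same table.
round_trip = pandas.read_csv(io.StringIO(csv_text))
print(round_trip.equals(food_data))
```

Without index=False, reading the file back would show an extra "Unnamed: 0" column holding the old row numbers.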

 
 
 



© 2024 by GifTED. Powered and secured by Wix
