Understanding csv file outputs in Python

Given a csv file with the following data:

1,10,20,30,40,50,60,70,80,90,100
2,210,220,230,240,250,260,270,280,290,300
3,310,330,340,350,360,370,380,390,400,410

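# Read every line of the csv file into a list of strings
with open("Anaconda3JamesData/james_test_3.csv", "r") as training_data_file:
    training_data_list = training_data_file.readlines()

# Print each record and count how many there are
count = 0
for record in training_data_list:
    # each record already ends with a newline, so print does not add another
    print(record, end="")
    count += 1

print(count)

We get the output:
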
1,10,20,30,40,50,60,70,80,90,100 
2,210,220,230,240,250,260,270,280,290,300
3,310,330,340,350,360,370,380,390,400,410
3

We can see that each record (each row) is visited in turn, and on each iteration the row from the csv file is displayed. This is confirmed by count, which ends at 3: one for each row.
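As a quick check of what the loop is working with, here is a minimal sketch. It assumes an in-memory stand-in for the file (io.StringIO, used here only so the example runs without james_test_3.csv); the names sample_file, records, row_count and first_record are illustrative and not part of the original code.

```python
import io

# In-memory stand-in for the csv file (same three rows as above).
sample_file = io.StringIO(
    "1,10,20,30,40,50,60,70,80,90,100\n"
    "2,210,220,230,240,250,260,270,280,290,300\n"
    "3,310,330,340,350,360,370,380,390,400,410\n"
)
records = sample_file.readlines()

# readlines() returns a list, so the number of rows is simply its length.
row_count = len(records)
print(row_count)

# Each record is one string; repr() makes the trailing newline visible.
first_record = records[0]
print(type(first_record).__name__)
print(repr(first_record))
```

The length of the list matches the value the loop counter reaches, and each record is a plain string that still ends with a newline character.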

At the moment, each record (row) is a single string. To make the data more usable, we can split it. The split method breaks a string apart at a given separator, in this case a comma, and returns the pieces as a list of strings.

# Read every line of the csv file into a list of strings (full training record)
with open("Anaconda3JamesData/james_test_3.csv", "r") as training_data_file:
    training_data_list = training_data_file.readlines()

count = 0
for record in training_data_list:
    # split the record at each comma into a list of strings
    all_values = record.split(',')
    print(all_values)
    count += 1

print(count)

We get the output:

['1', '10', '20', '30', '40', '50', '60', '70', '80', '90', '100\n'] 
['2', '210', '220', '230', '240', '250', '260', '270', '280', '290', '300\n']
['3', '310', '330', '340', '350', '360', '370', '380', '390', '400', '410\n']
3
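Note that the last element of each list still carries the newline from the end of the line ('100\n'). One possible way to avoid this, sketched here with a hard-coded sample record rather than the file, is to strip the record before splitting it. The name clean_values is illustrative only.

```python
# A single record as readlines() would return it, newline included.
record = "1,10,20,30,40,50,60,70,80,90,100\n"

# Remove surrounding whitespace (including the trailing newline), then split on commas.
clean_values = record.strip().split(",")
print(clean_values)
```

With the newline removed, the final element is '100' rather than '100\n'.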

all_values in the code is now a list, and its individual elements can be accessed by index. By changing

print(all_values)

to

print(all_values[1])

We get the output:

10 
210
310
3

We see that only the second element of each row is now displayed: list indexing starts at 0, so index 1 refers to the second element.
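The elements produced by split are still strings, so '10' is text rather than a number. If the values are needed for arithmetic, they can be converted with int. The sketch below assumes the three rows are hard-coded in a list (records) instead of being read from the file, and collects the second column into a hypothetical list called second_column.

```python
# The same three rows as in the csv file, hard-coded for a self-contained example.
records = [
    "1,10,20,30,40,50,60,70,80,90,100\n",
    "2,210,220,230,240,250,260,270,280,290,300\n",
    "3,310,330,340,350,360,370,380,390,400,410\n",
]

second_column = []
for record in records:
    # strip the newline, then split the record at each comma
    all_values = record.strip().split(",")
    # index 1 is the second element; int() converts the string to a number
    second_column.append(int(all_values[1]))

print(second_column)  # prints [10, 210, 310]
```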
