Grouping comma-separated lines together
I have comma-delimited files like these, where the first field is sorted in increasing order:
Case 1 (first file):
    abcd,1
    abcd,21
    abcd,122
    abce,12
    abcf,13
    abcf,21
Case 2 (another file):
    abcd,1
    abcd,21
    abcd,122
I want to convert the first file to this:
    abcd 1,21,122
    abce 12
    abcf 13,21
And similarly, the second file to this:
    abcd 1,21,122
I wrote some very ugly code with a lot of ifs that checks whether the next line's string before the comma is the same as the current line's, and branches on the result.
It is so badly written that, although I wrote it myself about 6 months ago, it took me 3-4 minutes to understand why I did what I did.
In case you would like to see it, here it is. It currently has a bug that I have not fixed, since I wanted a better approach than this whole code anyway: for the curious, it doesn't print anything for the second case above, and I know why.
    def clean_file(filePath, destination):
        f = open(filePath, 'r')
        data = f.read()
        f.close()
        curr_string = current_number = next_string = next_number = ""
        current_numbers = ""
        final_payload = ""
        lines = data.split('\n')[:-1]
        for i in range(len(lines)-1):
            print(lines[i])
            curr_line = lines[i]
            next_line = lines[i+1]
            curr_string, current_number = curr_line.split(',')
            next_string, next_number = next_line.split(',')
            if curr_string == next_string:
                current_numbers += current_number + ","
            else:
                current_numbers += current_number  # check to avoid ',' in the end
                final_payload += curr_string + " " + current_numbers + "\n"
                current_numbers = ""
        print(final_payload)
        # For last line
        if curr_string != next_string:
            # Directly add it to the final_payload
            final_payload += next_line + "\n"
        else:
            # Remove the newline, add a comma and then finally add a newline
            final_payload = final_payload[:-1] + "," + next_number + "\n"
        with open(destination, 'a') as f:
            f.write(final_payload)
Any better solutions?
python csv
edited Jan 13 at 19:02 by Mast
asked Jan 13 at 17:31 by temporarya
migrated from stackoverflow.com Jan 13 at 17:33
This question came from our site for professional and enthusiast programmers.
3
$begingroup$
Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
$endgroup$
– Mast
Jan 13 at 19:02
2 Answers
- To solve the grouping problem, use itertools.groupby.
- To read files with comma-separated fields, use the csv module.
- In almost all cases, open() should be called using a with block, so that the files will be automatically closed for you, even if an exception occurs within the block:

    import csv

    with open(file_path) as in_f, open(destination, 'w') as out_f:
        data = csv.reader(in_f)
        # code goes here

- filePath violates Python's official style guide, which recommends underscores, like your curr_line.
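The first two suggestions can be combined into a short sketch. This is one possible reading of the advice above, not the answerer's own code: the function name group_rows and the in-memory demonstration data are assumptions made for illustration. It relies on the first field already being sorted, as the question states, because groupby only merges adjacent rows.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def group_rows(in_f, out_f):
    """Write one 'key v1,v2,...' line per run of rows sharing the same first field."""
    rows = csv.reader(in_f)
    # groupby only merges *adjacent* rows with equal keys, which is enough
    # here because the input is sorted on the first field.
    for key, group in groupby(rows, key=itemgetter(0)):
        values = ",".join(row[1] for row in group)
        out_f.write(f"{key} {values}\n")


# Demonstration on in-memory text (hypothetical data mirroring Case 1).
source = io.StringIO("abcd,1\nabcd,21\nabcd,122\nabce,12\nabcf,13\nabcf,21\n")
result = io.StringIO()
group_rows(source, result)
print(result.getvalue(), end="")
```

With real files, the same function could be called inside a with block that opens both the input and the destination, as in the snippet above.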
edited Jan 13 at 20:52
answered Jan 13 at 18:00 by 200_success
1
$begingroup$
Thanks a lot, I got it using those two.
$endgroup$
– temporarya
Jan 13 at 18:25
While @200_success's answer is very good (always use libraries that solve your problem), I'm going to give an answer that illustrates how to think about more general problems in case there isn't a perfect library.
Use with to automatically close files when you're done
You risk leaving a file open if an exception is raised and file.close() is never called.
    with open(input_file) as in_file:
        ...  # work with in_file here
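To make the point about exceptions concrete, here is a small sketch that raises an error inside a with block and then checks whether the file was closed. The temporary file and the simulated ValueError are assumptions made purely for the demonstration.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(text=True)
os.close(fd)

try:
    with open(path) as in_file:
        # Simulate something going wrong while the file is open.
        raise ValueError("simulated failure while reading")
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception.
closed_after_error = in_file.closed
os.remove(path)
print(closed_after_error)
```

With a bare open() followed by close(), the close() call would have been skipped when the exception propagated.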
Use the object to iterate, not indices
Most collections and objects can be iterated over directly, so you don't need indices.
    with open(input_file) as in_file:
        for line in in_file:
            line = line.strip()  # get rid of '\n' at end of line
Use data structures to organize your data
In the end, you want to associate a letter-string with a list of numbers. In Python, a dict allows you to associate one piece of data (the key) with another (the value), so we'll use that to associate each letter-string with a list of numbers.
    with open(input_file) as in_file:
        data = dict()
        for line in in_file:
            line = line.strip()  # get rid of '\n' at end of line
            letters, numbers = line.split(',')
            data[letters].append(numbers)
Now, this doesn't quite work: if a letters entry hasn't been seen yet, data[letters] has nothing to return and raises a KeyError exception. So we have to account for that:
    with open(input_file) as in_file:
        data = dict()
        for line in in_file:
            line = line.strip()  # get rid of '\n' at end of line
            letters, number = line.split(',')
            try:  # there might be an error
                data[letters].append(number)  # append new number if letters has been seen before
            except KeyError:
                data[letters] = [number]  # create new list with one number for a new letter-string
Now the whole file is stored in a convenient form in the data object. To output, just loop through the data. Since Python 3.7 a dict keeps insertion order, so the keys come out in the order they were first read.
    with open(input_file) as in_file:
        data = dict()
        for line in in_file:
            line = line.strip()  # get rid of '\n' at end of line
            letters, number = line.split(',')
            try:  # there might be an error
                data[letters].append(number)  # append new number if letters has been seen before
            except KeyError:
                data[letters] = [number]  # create new list with one number for a new letter-string

    with open(output_file, 'w') as out_file:
        for letters, number_list in data.items():  # iterate over all entries
            out_file.write(letters + ' ' + ','.join(number_list) + '\n')
The .join() method creates a string from a list such that the entries of the list are separated by the string it is called on (',' in this case).
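As a quick illustration of join() in isolation, the sketch below builds one output line from a list. The sample values are taken from the question; the variable names are chosen here for the example. It also shows that join() expects strings, so other types need converting first.

```python
# Build one output line from a key and its list of values.
number_list = ["1", "21", "122"]
line = "abcd" + " " + ",".join(number_list)
print(line)

# join() only accepts strings, so convert other types before joining.
as_ints = [1, 21, 122]
line_from_ints = "abcd " + ",".join(str(n) for n in as_ints)
print(line_from_ints)
```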
answered Jan 13 at 22:47 by Mark H
1
$begingroup$
Instead of trying to append and catching the error, you can use setdefault: data.setdefault(letters, []).append(number)
$endgroup$
– Todd Sewell
Jan 13 at 23:04
$begingroup$
@ToddSewell Neat! That'll be useful in the future.
$endgroup$
– Mark H
Jan 13 at 23:08
$begingroup$
Or use collections.defaultdict of course.
$endgroup$
– Graipher
Jan 14 at 14:24
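The two alternatives suggested in these comments could look roughly like this. It is a sketch on an in-memory list of lines rather than a file, and the names lines, data and grouped are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical input mirroring Case 1 from the question.
lines = ["abcd,1", "abcd,21", "abcd,122", "abce,12", "abcf,13", "abcf,21"]

# Variant 1: dict.setdefault inserts an empty list the first time a key is seen,
# then returns the list stored under that key.
data = {}
for line in lines:
    letters, number = line.split(",")
    data.setdefault(letters, []).append(number)

# Variant 2: defaultdict(list) creates the empty list automatically on first access.
grouped = defaultdict(list)
for line in lines:
    letters, number = line.split(",")
    grouped[letters].append(number)

print(data)
print(dict(grouped))
```

Both variants remove the need for the try/except KeyError block while producing the same mapping.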