Python 3 pandas.groupby.filter


I am trying to perform a groupby filter that is very similar to the example in this documentation: pandas groupby filter

>>> import pandas as pd
>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
...                           'foo', 'bar'],
...                    'B' : [1, 2, 3, 4, 5, 6],
...                    'C' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')
>>> grouped.filter(lambda x: x['B'].mean() > 3.)
     A  B    C
1  bar  2  5.0
3  bar  4  1.0
5  bar  6  9.0

I am trying to return a DataFrame that has all 3 columns, but only 2 rows. Those 2 rows contain the minimum values of column B, after grouping by column A. I tried the following line of code:

grouped.filter(lambda x: x['B'] == x['B'].min())

But this doesn't work, and I get this error:
TypeError: filter function returned a Series, but expected a scalar bool

The DataFrame I am trying to return should look like this:

     A  B    C
0  foo  1  2.0
1  bar  2  5.0

I would appreciate any help you can provide. Thank you, in advance, for your help.

edited Feb 16 at 2:28

weliketocode

690513

asked Feb 15 at 21:45

FinProg

605

3

The doc string reading can seem a bit ambiguous: "Return a copy of a DataFrame excluding elements from groups that do not satisfy..." You aren't excluding elements from groups, you are excluding elements from the DataFrame of groups that do not satisfy the single condition.

– ALollz
Feb 15 at 22:33

@ALollz: please file a docbug to improve the docstring

– smci
Feb 16 at 2:41

python pandas dataframe

5 Answers

>>> # sort=False to return the rows in the order they originally occurred
>>> df.loc[df.groupby("A", sort=False)["B"].idxmin()]
     A  B    C
0  foo  1  2.0
1  bar  2  5.0

edited Feb 18 at 14:59

answered Feb 16 at 0:20

BallpointBen

3,7681639

No need for groupby :-)

df.sort_values('B').drop_duplicates('A')
Out[288]: 
     A  B    C
0  foo  1  2.0
1  bar  2  5.0

answered Feb 15 at 22:39

Wen-Ben

128k83872

There's a fundamental difference: In the documentation example, there is a single Boolean value per group. That is, you return the entire group if the mean is greater than 3. In your example, you want to filter specific rows within a group.
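To make that distinction concrete, here is a minimal sketch that reuses the question's df. The names group_decision, kept_groups and row_mask are made up for illustration. It contrasts the one-bool-per-group result that filter() expects with a one-bool-per-row mask, which is what the question's lambda produces.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [2.0, 5.0, 8.0, 1.0, 2.0, 9.0]})
grouped = df.groupby('A')

# One scalar bool per group: this is the shape of answer filter() wants.
group_decision = grouped['B'].mean() > 3.0

# filter() keeps or drops each group as a whole, based on that scalar.
kept_groups = grouped.filter(lambda g: g['B'].mean() > 3.0)

# One bool per row: a row mask. filter() rejects this, but it can be
# used to index df directly.
row_mask = df['B'] == grouped['B'].transform('min')
min_rows = df[row_mask]
```

Here group_decision has one entry per group, while row_mask has one entry per row of df. Only the first kind can be returned from the function passed to filter().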

For your task the usual trick is to sort values and use .head or .tail to filter to the row with the smallest or largest value respectively:

df.sort_values('B').groupby('A').head(1)

#     A  B    C
#0  foo  1  2.0
#1  bar  2  5.0

For more complicated queries you can use .transform or .apply to create a Boolean Series to slice with. This is also the safer choice when multiple rows share the minimum and you need all of them:

df[df.groupby('A').B.transform(lambda x: x == x.min())]

#     A  B    C
#0  foo  1  2.0
#1  bar  2  5.0
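The question's data has no ties, so both approaches give the same two rows there. A small sketch of where they differ, using a hypothetical frame called tied (not from the question) in which group 'foo' has two rows with the minimum B:

```python
import pandas as pd

# Hypothetical data: rows 0 and 2 both hold the minimum B for group 'foo'.
tied = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'],
                     'B': [1, 2, 1, 4],
                     'C': [2.0, 5.0, 8.0, 1.0]})

# sort + head(1) keeps exactly one row per group, even when there is a tie.
one_per_group = tied.sort_values('B').groupby('A').head(1)

# The transform mask keeps every row whose B equals its group's minimum.
all_minima = tied[tied['B'] == tied.groupby('A')['B'].transform('min')]
```

With this data one_per_group has 2 rows and all_minima has 3. Which of the tied 'foo' rows head(1) keeps depends on the sort order, so use the mask if you need all of them.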

edited Feb 15 at 22:44

answered Feb 15 at 22:19

ALollz

16.9k41838

df.groupby('A').apply(lambda x: x.loc[x['B'].idxmin(), ['B','C']]).reset_index()

answered Feb 15 at 21:54

kudeh

490210

The short answer:

grouped.apply(lambda x: x[x['B'] == x['B'].min()])

... and the longer one:

Your grouped object has 2 groups:

In[25]: for group in grouped:
   ...:     print(group)
   ...:     
('bar',      
     A  B    C
1  bar  2  5.0
3  bar  4  1.0
5  bar  6  9.0)

('foo',      
     A  B    C
0  foo  1  2.0
2  foo  3  8.0
4  foo  5  2.0)

The filter() method of a GroupBy object filters groups as whole entities, NOT their individual rows. So with filter() you can obtain only 4 possible results:

an empty DataFrame (0 rows),

rows of the group 'bar' (3 rows),

rows of the group 'foo' (3 rows),

rows of both groups (6 rows).

Nothing else, regardless of the Boolean function you pass to filter().

So you have to use some other method. An appropriate one is the very flexible apply() method, which lets you apply an arbitrary function that

takes a DataFrame (one group of the GroupBy object) as its only parameter,

returns either a pandas object or a scalar.

In your case that function should return, for each of your 2 groups, the 1-row DataFrame having the minimal value in column 'B', so we will use the Boolean mask

group['B'] == group['B'].min()

to select that row (or possibly several rows, if the minimum is tied):

In[26]: def select_min_b(group):
   ...:     return group[group['B'] == group['B'].min()]

Now, passing this function to the apply() method of the GroupBy object grouped, we obtain

In[27]: grouped.apply(select_min_b)
Out[27]: 
         A  B    C
A                 
bar 1  bar  2  5.0
foo 0  foo  1  2.0
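The result above carries a two-level index (group key, original row label), which is not quite the flat frame the question asked for. A hedged sketch of one way to get back to the original row labels; the names nested, flat and wanted are made up for illustration. Note that whether the grouping column 'A' appears in the columns of the apply() result depends on the pandas version (newer versions warn about it or leave it out), so this sketch looks the rows up in df again instead of relying on it.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [2.0, 5.0, 8.0, 1.0, 2.0, 9.0]})

def select_min_b(group):
    # Keep the row(s) of this group whose B equals the group's minimum.
    return group[group['B'] == group['B'].min()]

# Index of the result is (group key, original row label).
nested = df.groupby('A').apply(select_min_b)

# Drop the group-key level so only the original row labels remain.
flat = nested.droplevel(0)

# Look the rows up in df to get all three columns, in original row order.
wanted = df.loc[flat.index].sort_index()
```

The final sort_index() restores the original row order (foo first, then bar), because groupby sorts the group keys by default.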

Note:

The same as a single command (using a lambda function):

grouped.apply(lambda group: group[group['B'] == group['B'].min()])

edited Feb 15 at 23:55

answered Feb 15 at 22:50

MarianD

4,47761433
