How to convert a pandas MultiIndex DataFrame into a 3D array
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}
Suppose I have a MultiIndex DataFrame:
c o l u
major timestamp
ONE 2019-01-22 18:12:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:13:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:14:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:15:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:16:00 0.00008 0.00008 0.00008 0.00008
TWO 2019-01-22 18:12:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:13:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:14:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:15:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:16:00 0.00008 0.00008 0.00008 0.00008
I want to generate a NumPy array from this DataFrame with a 3-dimensional, given the dataframe has 15 categories in the major column, 4 columns and one time index of length 5. I would like to create a numpy array with a shape of (4,15,5) denoting (columns, categories, time_index) respectively.
should create an array:
array([[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]]])
One used to be able to do this with pd.Panel:
panel = pd.Panel(items=[columns], major_axis=[categories], minor_axis=[time_index], dtype=np.float32)
...
How would I be able to most effectively accomplish this with a multi index dataframe?
Thanks
python arrays pandas numpy
add a comment |
Suppose I have a MultiIndex DataFrame:
c o l u
major timestamp
ONE 2019-01-22 18:12:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:13:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:14:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:15:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:16:00 0.00008 0.00008 0.00008 0.00008
TWO 2019-01-22 18:12:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:13:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:14:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:15:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:16:00 0.00008 0.00008 0.00008 0.00008
I want to generate a NumPy array from this DataFrame with a 3-dimensional, given the dataframe has 15 categories in the major column, 4 columns and one time index of length 5. I would like to create a numpy array with a shape of (4,15,5) denoting (columns, categories, time_index) respectively.
should create an array:
array([[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]]])
One used to be able to do this with pd.Panel:
panel = pd.Panel(items=[columns], major_axis=[categories], minor_axis=[time_index], dtype=np.float32)
...
How would I be able to most effectively accomplish this with a multi index dataframe?
Thanks
python arrays pandas numpy
add a comment |
Suppose I have a MultiIndex DataFrame:
c o l u
major timestamp
ONE 2019-01-22 18:12:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:13:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:14:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:15:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:16:00 0.00008 0.00008 0.00008 0.00008
TWO 2019-01-22 18:12:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:13:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:14:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:15:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:16:00 0.00008 0.00008 0.00008 0.00008
I want to generate a NumPy array from this DataFrame with a 3-dimensional, given the dataframe has 15 categories in the major column, 4 columns and one time index of length 5. I would like to create a numpy array with a shape of (4,15,5) denoting (columns, categories, time_index) respectively.
should create an array:
array([[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]]])
One used to be able to do this with pd.Panel:
panel = pd.Panel(items=[columns], major_axis=[categories], minor_axis=[time_index], dtype=np.float32)
...
How would I be able to most effectively accomplish this with a multi index dataframe?
Thanks
python arrays pandas numpy
Suppose I have a MultiIndex DataFrame:
c o l u
major timestamp
ONE 2019-01-22 18:12:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:13:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:14:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:15:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:16:00 0.00008 0.00008 0.00008 0.00008
TWO 2019-01-22 18:12:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:13:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:14:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:15:00 0.00008 0.00008 0.00008 0.00008
2019-01-22 18:16:00 0.00008 0.00008 0.00008 0.00008
I want to generate a NumPy array from this DataFrame with a 3-dimensional, given the dataframe has 15 categories in the major column, 4 columns and one time index of length 5. I would like to create a numpy array with a shape of (4,15,5) denoting (columns, categories, time_index) respectively.
should create an array:
array([[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]],
[[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05],
[8.e-05, 8.e-05, 8.e-05, 8.e-05, 8.e-05]]])
One used to be able to do this with pd.Panel:
panel = pd.Panel(items=[columns], major_axis=[categories], minor_axis=[time_index], dtype=np.float32)
...
How would I be able to most effectively accomplish this with a multi index dataframe?
Thanks
python arrays pandas numpy
python arrays pandas numpy
edited Feb 10 at 11:34
Brad
asked Feb 10 at 11:25
BradBrad
331311
331311
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
How about using xarray
?
res = df.to_xarray().to_array()
Result is an array of shape (4, 15, 5)
In fact the docs now recommend this as an alternative to pandas Panel
. Note that you must have the xarray
package installed.
add a comment |
Since df.values
is a (15*100, 4)
-shaped array, you can call reshape
to make it a (15, 100, 4)
-shaped array:
arr = df.values.reshape(15, 100, 4)
Then call transpose
to rearrange the order of the axes:
arr = arr.transpose(2, 0, 1)
Now arr
has shape (4, 15, 100)
.
Using reshape/transpose
is ~960x faster than to_xarray().to_array()
:
In [21]: df = pd.DataFrame(np.random.randint(10, size=(15*100, 4)), index=pd.MultiIndex.from_product([range(15), range(100)], names=['A','B']), columns=list('colu'))
In [22]: %timeit arr = df.values.reshape(15, 100, 4).transpose(2, 0, 1)
3.31 µs ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [24]: %timeit df.to_xarray().to_array()
3.18 ms ± 24.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [25]: 3180/3.31
Out[25]: 960.7250755287009
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54615882%2fhow-to-convert-a-pandas-multiindex-dataframe-into-a-3d-array%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
How about using xarray
?
res = df.to_xarray().to_array()
Result is an array of shape (4, 15, 5)
In fact the docs now recommend this as an alternative to pandas Panel
. Note that you must have the xarray
package installed.
add a comment |
How about using xarray
?
res = df.to_xarray().to_array()
Result is an array of shape (4, 15, 5)
In fact the docs now recommend this as an alternative to pandas Panel
. Note that you must have the xarray
package installed.
add a comment |
How about using xarray
?
res = df.to_xarray().to_array()
Result is an array of shape (4, 15, 5)
In fact the docs now recommend this as an alternative to pandas Panel
. Note that you must have the xarray
package installed.
How about using xarray
?
res = df.to_xarray().to_array()
Result is an array of shape (4, 15, 5)
In fact the docs now recommend this as an alternative to pandas Panel
. Note that you must have the xarray
package installed.
answered Feb 10 at 11:40
Josh FriedlanderJosh Friedlander
3,1911933
3,1911933
add a comment |
add a comment |
Since df.values
is a (15*100, 4)
-shaped array, you can call reshape
to make it a (15, 100, 4)
-shaped array:
arr = df.values.reshape(15, 100, 4)
Then call transpose
to rearrange the order of the axes:
arr = arr.transpose(2, 0, 1)
Now arr
has shape (4, 15, 100)
.
Using reshape/transpose
is ~960x faster than to_xarray().to_array()
:
In [21]: df = pd.DataFrame(np.random.randint(10, size=(15*100, 4)), index=pd.MultiIndex.from_product([range(15), range(100)], names=['A','B']), columns=list('colu'))
In [22]: %timeit arr = df.values.reshape(15, 100, 4).transpose(2, 0, 1)
3.31 µs ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [24]: %timeit df.to_xarray().to_array()
3.18 ms ± 24.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [25]: 3180/3.31
Out[25]: 960.7250755287009
add a comment |
Since df.values
is a (15*100, 4)
-shaped array, you can call reshape
to make it a (15, 100, 4)
-shaped array:
arr = df.values.reshape(15, 100, 4)
Then call transpose
to rearrange the order of the axes:
arr = arr.transpose(2, 0, 1)
Now arr
has shape (4, 15, 100)
.
Using reshape/transpose
is ~960x faster than to_xarray().to_array()
:
In [21]: df = pd.DataFrame(np.random.randint(10, size=(15*100, 4)), index=pd.MultiIndex.from_product([range(15), range(100)], names=['A','B']), columns=list('colu'))
In [22]: %timeit arr = df.values.reshape(15, 100, 4).transpose(2, 0, 1)
3.31 µs ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [24]: %timeit df.to_xarray().to_array()
3.18 ms ± 24.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [25]: 3180/3.31
Out[25]: 960.7250755287009
add a comment |
Since df.values
is a (15*100, 4)
-shaped array, you can call reshape
to make it a (15, 100, 4)
-shaped array:
arr = df.values.reshape(15, 100, 4)
Then call transpose
to rearrange the order of the axes:
arr = arr.transpose(2, 0, 1)
Now arr
has shape (4, 15, 100)
.
Using reshape/transpose
is ~960x faster than to_xarray().to_array()
:
In [21]: df = pd.DataFrame(np.random.randint(10, size=(15*100, 4)), index=pd.MultiIndex.from_product([range(15), range(100)], names=['A','B']), columns=list('colu'))
In [22]: %timeit arr = df.values.reshape(15, 100, 4).transpose(2, 0, 1)
3.31 µs ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [24]: %timeit df.to_xarray().to_array()
3.18 ms ± 24.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [25]: 3180/3.31
Out[25]: 960.7250755287009
Since df.values
is a (15*100, 4)
-shaped array, you can call reshape
to make it a (15, 100, 4)
-shaped array:
arr = df.values.reshape(15, 100, 4)
Then call transpose
to rearrange the order of the axes:
arr = arr.transpose(2, 0, 1)
Now arr
has shape (4, 15, 100)
.
Using reshape/transpose
is ~960x faster than to_xarray().to_array()
:
In [21]: df = pd.DataFrame(np.random.randint(10, size=(15*100, 4)), index=pd.MultiIndex.from_product([range(15), range(100)], names=['A','B']), columns=list('colu'))
In [22]: %timeit arr = df.values.reshape(15, 100, 4).transpose(2, 0, 1)
3.31 µs ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [24]: %timeit df.to_xarray().to_array()
3.18 ms ± 24.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [25]: 3180/3.31
Out[25]: 960.7250755287009
edited Feb 10 at 11:52
answered Feb 10 at 11:32
unutbuunutbu
562k10612141268
562k10612141268
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54615882%2fhow-to-convert-a-pandas-multiindex-dataframe-into-a-3d-array%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown