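python - One column holds an int, the other holds a list of ints. How do I convert the DataFrame into a NumPy rec array of these pairs?
Question
This is a follow-up to this question:
Best data type (in terms of speed/RAM) for millions of pairs of a single int paired with a batch (2 to 100) of ints
That question asks for the best way to store pairs of a single integer and a batch of integers.
The answer there was to use np.rec, a convenient way to create mixed-type arrays that lets us place each single number next to its batch.
The result of that code looks like this: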
rec.array([( 2955637, array([ 2557706, 7612432, 9348232, 462772, 8018521, 1811275,
9230331, 7023852, 9392270, 4693741, 7854644, 5233547,
12446986, 9534800, 2133753, 5971332, 2156690, 12031365,
4433539, 11607217, 3461811, 5361706, 11282946, 14548809,
8109194, 1199299, 7576507, 12035216, 6635766, 4158077,
5403991, 212711, 1703853, 2094248, 7005438, 951244,
6314059, 11616582, 13002385, 761714, 14016603, 14981654,
8946411, 10050035, 658239, 1693614], dtype=int32)),
( 822302, array([ 2579065, 14360524, 4489101, 14753709, 7440511, 2202626,
504487, 8539709, 6309347, 9028007, 4103133, 6899943,
9391766, 1104058, 10155666, 2845288, 10488737, 1728141,
3976034, 13648527, 6125367, 14690826, 7387347, 7766092,
8717468, 4088448, 2051190, 7914318, 14346922, 13792566,
10343601], dtype=int32)),
( 7777177, array([ 7067232, 11850092, 10343145, 2705178, 9676842, 13392954],
dtype=int32)),
( 7094192, array([ 667930, 2256509, 2860846, 8740657, 3188292, 616645,
12264189, 3827714, 1197702, 11838296, 8450768, 6224672,
10233979, 720212, 13010797, 10508000, 485815, 4040839,
5690852, 8699534, 7200456, 9946306, 14594793, 406437,
5148634, 11229656, 5497334, 3438910, 8301374, 9274725,
4141693, 8846590, 14372346, 1294167, 6341159, 7003319,
7803775, 13882589, 4289922, 14872568, 8094153, 3783601,
12847787, 13833383, 2996757, 12961865, 4205083, 12390923,
5705005, 8842488, 6230348, 5690850, 7154638, 10787173,
10200101, 13943625, 373645, 5115795, 7105045, 899756,
6020046], dtype=int32)),
( 3913008, array([ 5132516, 309940, 7487946, 2927897, 6294641, 701812,
11043226, 7788088, 7465944, 2077922, 13552610, 6345947,
187965, 14830364, 8483266, 8128046, 3227008, 4159033,
12652217, 1919861, 4529511, 2186353, 7407808, 5604777,
13500413, 786580, 7588024, 303460, 13426737, 7131729,
8763962, 5498921, 13099372, 4330432, 5795060, 8424029,
14073436, 2315788, 5657156, 10177080, 4476134, 13418083,
6874374, 1786599, 8115421, 11373555, 1186217, 1098336,
160627, 9177101, 14888415, 11619492, 13326025, 13129137,
10589806, 2659293, 7845901, 6619936, 1939703, 7692026],
dtype=int32)),
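In my case, the data is stored in a pandas DataFrame. For each row, one column holds an int and the other holds a Python list of ints.
How can I convert it to the np.rec array format above, e.g.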
rec.array([( int, array([ bunch of ints]) ), (int, array([ bunch of ints]) ), . . . .
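The first pair would be the first row, the second pair the second row, and so on.
Solution
Data:
import numpy as np
import pandas as pd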
data = np.rec.array([( 2955637, np.array([ 2557706, 7612432, 9348232, 462772, 8018521, 1811275,
9230331, 7023852, 9392270, 4693741, 7854644, 5233547,
12446986, 9534800, 2133753, 5971332, 2156690, 12031365,
4433539, 11607217, 3461811, 5361706, 11282946, 14548809,
8109194, 1199299, 7576507, 12035216, 6635766, 4158077,
5403991, 212711, 1703853, 2094248, 7005438, 951244,
6314059, 11616582, 13002385, 761714, 14016603, 14981654,
8946411, 10050035, 658239, 1693614], dtype=np.int32)),
( 822302, np.array([ 2579065, 14360524, 4489101, 14753709, 7440511, 2202626,
504487, 8539709, 6309347, 9028007, 4103133, 6899943,
9391766, 1104058, 10155666, 2845288, 10488737, 1728141,
3976034, 13648527, 6125367, 14690826, 7387347, 7766092,
8717468, 4088448, 2051190, 7914318, 14346922, 13792566,
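10343601], dtype=np.int32))],
# explicit dtype: the batches differ in length, so they are stored in an object field
dtype=[('f0', np.int64), ('f1', object)])
DataFrame: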
df = pd.DataFrame(data)
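Back to np.rec.array:
d2 = list(zip(df.f0.tolist(), df.f1.tolist()))
# explicit dtype so the variable-length batches land in an object field
d2 = np.rec.array(d2, dtype=[('f0', np.int64), ('f1', object)])
Result: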
print(type(d2))
>>> <class 'numpy.recarray'>
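The question starts from a DataFrame whose second column holds plain Python lists rather than NumPy arrays, so here is a minimal sketch for that case. The column names `single` and `batch` and the sample values are hypothetical, chosen only for illustration. The sketch declares the dtype up front instead of letting NumPy infer it, since inference on batches of different lengths may fail on recent NumPy releases.

```python
import numpy as np
import pandas as pd

# Hypothetical frame shaped like the one in the question:
# one int per row plus a Python list of ints per row.
df = pd.DataFrame({
    "single": [2955637, 822302, 7777177],
    "batch": [[2557706, 7612432, 9348232], [2579065, 14360524], [7067232]],
})

# One int field and one object field; the object field will hold a
# variable-length int32 array for each row.
pair_dtype = np.dtype([("f0", np.int64), ("f1", object)])

pairs = np.empty(len(df), dtype=pair_dtype)
pairs["f0"] = df["single"].to_numpy()
for i, batch in enumerate(df["batch"]):
    # Convert each Python list to a compact int32 array before storing it.
    pairs["f1"][i] = np.asarray(batch, dtype=np.int32)

# View the same memory as a recarray for attribute-style field access.
rec = pairs.view(np.recarray)

print(type(rec))
print(rec.f0)
print(rec.f1[1])
```

If keeping the batches as Python lists is acceptable, `df.to_records(index=False)` also returns a record array directly, with the list column stored as an object field under the DataFrame's own column names.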