Sort strings in pandas



By : user3100227
Date : January 12 2021, 01:40 AM
One possible solution uses natsort to get the indices that sort the values, then reorders the original DataFrame with loc. Two alternatives that need only pandas follow it:
code :
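import pandas as pd
from natsort import index_natsorted, order_by_index

# natural sort of the 'id' strings, applied to the original frame via loc
df2 = df.loc[order_by_index(df.index, index_natsorted(df['id']))]

# alternative: split 'id' on '_', convert the parts to int / datetime,
# sort the converted parts and reuse their index on the original frame
df1 = df['id'].str.split('_', expand=True)
df1[[0,2,3]] = df1[[0,2,3]].astype(int)
df1[1] = pd.to_datetime(df1[1])

df2 = df.loc[df1.sort_values([0,1,2,3]).index]
print(df2)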
                     id
0   1075_2016-06-01_0_1
5   1075_2016-06-01_1_1
6   1075_2016-06-01_1_2
1  1075_2016-06-01_10_1
2  1075_2016-06-01_10_2
3  1075_2016-06-01_11_1
4  1075_2016-06-01_11_2
# alternative: build a sortable list per row, argsort it for positions, reorder with iloc
f = lambda x: [int(x[0]), pd.to_datetime(x[1]), int(x[2]), int(x[3])]
df2 = df.iloc[df['id'].str.split('_').map(f).argsort()]
print (df2)
                     id
0   1075_2016-06-01_0_1
5   1075_2016-06-01_1_1
6   1075_2016-06-01_1_2
1  1075_2016-06-01_10_1
2  1075_2016-06-01_10_2
3  1075_2016-06-01_11_1
4  1075_2016-06-01_11_2
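
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the split-and-convert approach. The sample frame is reconstructed from the printed output above, and the column names group, date, major and minor are made up for illustration (the post does not name the parts). It also shows that a plain string sort leaves the rows in their original order.

```python
import pandas as pd

# Sample data reconstructed from the output above (index 0..6).
df = pd.DataFrame({'id': [
    '1075_2016-06-01_0_1',
    '1075_2016-06-01_10_1',
    '1075_2016-06-01_10_2',
    '1075_2016-06-01_11_1',
    '1075_2016-06-01_11_2',
    '1075_2016-06-01_1_1',
    '1075_2016-06-01_1_2',
]})

# Split on '_' and give every part a proper dtype.
# The column names below are hypothetical labels, not from the post.
parts = df['id'].str.split('_', expand=True)
keys = pd.DataFrame({
    'group': parts[0].astype(int),
    'date': pd.to_datetime(parts[1]),
    'major': parts[2].astype(int),
    'minor': parts[3].astype(int),
})

# Sort the typed parts, then reuse the resulting index on the original frame.
df2 = df.loc[keys.sort_values(['group', 'date', 'major', 'minor']).index]

# A plain string sort compares character by character, so '10' sorts before '1_'.
lexicographic = df.sort_values('id')

print(df2)
```

Sorting on typed columns rather than on the raw strings is what makes 1 come before 10 here.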


Pandas sort dataframe by column with strings and integers



By : Reynan
Date : March 29 2020, 07:55 AM
One option is to group the data frame by the data type of column a and then sort each group separately:
code :
# DataFrame.sort was removed from pandas; sort_values is the replacement
df.groupby(df.a.apply(type) != str).apply(lambda g: g.sort_values('a')).reset_index(drop=True)
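
If the groupby one-liner is hard to follow, the same idea can be spelled out by hand. This is only a sketch on made-up sample data, and it uses a boolean mask plus pd.concat instead of groupby. Note that it puts the numbers first, whereas the one-liner above puts the string group first; swap the list passed to concat to change that.

```python
import pandas as pd

# Hypothetical sample data: column 'a' mixes integers and strings.
df = pd.DataFrame({'a': [3, 'b', 1, 'a', 2]})

# True where the value is a string, False otherwise.
is_str = df['a'].map(lambda v: isinstance(v, str))

# Sort each type separately so Python never compares int with str.
numbers = df[~is_str].sort_values('a')
strings = df[is_str].sort_values('a')

# Stitch the two sorted pieces together and renumber the rows.
result = pd.concat([numbers, strings], ignore_index=True)
print(result)
```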
How to sort a pandas dataframe by a column that has both numbers and strings?



By : Nasser
Date : March 29 2020, 07:55 AM
One approach combines pd.to_numeric, sort_values, and loc:
code :
df.loc[pd.to_numeric(df.col0, errors='coerce').sort_values().index]

        col0    col1  col2  col4
3         34  865665   296     0
4         56  865700   297     0
5        100  865628   292     5
1  '1ZE7999'  865545    20    20
2  'R022428'  865584   297     0
# step by step: to_numeric with errors='coerce' turns non-numeric values into NaN
i = pd.to_numeric(df.col0, errors='coerce')
i

1      NaN
2      NaN
3     34.0
4     56.0
5    100.0
Name: col0, dtype: float64
# sort_values places the NaN entries last by default
j = i.sort_values()
j

3     34.0
4     56.0
5    100.0
1      NaN
2      NaN
Name: col0, dtype: float64
# use the sorted index to reorder the original frame
df.loc[j.index]

        col0    col1  col2  col4
3         34  865665   296     0
4         56  865700   297     0
5        100  865628   292     5
1  '1ZE7999'  865545    20    20
2  'R022428'  865584   297     0
# reindex gives the same result as loc here
df.reindex(index=j.index)

        col0    col1  col2  col4
3         34  865665   296     0
4         56  865700   297     0
5        100  865628   292     5
1  '1ZE7999'  865545    20    20
2  'R022428'  865584   297     0
# optionally renumber the rows afterwards
df.loc[j.index].reset_index(drop=True)

        col0    col1  col2  col4
0         34  865665   296     0
1         56  865700   297     0
2        100  865628   292     5
3  '1ZE7999'  865545    20    20
4  'R022428'  865584   297     0
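
On pandas 1.1 or newer, the same ordering can also be had in one call through the key argument of sort_values. The sketch below rebuilds part of the frame from the output above (only col0 and col1, for brevity) and checks that both routes agree. Treat it as an illustration, not as the original poster's code.

```python
import pandas as pd

# Sample data rebuilt from the output above; col2 and col4 are left out.
df = pd.DataFrame(
    {'col0': ["'1ZE7999'", "'R022428'", 34, 56, 100],
     'col1': [865545, 865584, 865665, 865700, 865628]},
    index=[1, 2, 3, 4, 5],
)

# Route 1 (from the answer): sort the coerced values, reuse their index.
by_loc = df.loc[pd.to_numeric(df['col0'], errors='coerce').sort_values().index]

# Route 2: let sort_values apply the coercion itself via key=.
by_key = df.sort_values('col0', key=lambda s: pd.to_numeric(s, errors='coerce'))

print(by_key)
```

Either way the non-numeric rows end up last, because they become NaN during the comparison.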
How to sort strings with numbers in Pandas?



By : user1567874
Date : March 29 2020, 07:55 AM
You can use sorted with a custom key function to calculate the indices that would sort the column (much like numpy.argsort), then feed them to pd.DataFrame.iloc:
code :
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Paul', 'Jean', 'Robert', 'John'],
                   'status': ['ok', 'must read 20 more books',
                              'must read 3 more books', 'does not read any book yet']})

def sort_key(x):
    if x[1] == 'ok':
        return -1
    elif x[1] == 'does not read any book yet':
        return np.inf
    else:
        return int(x[1].split()[2])

idx = [idx for idx, _ in sorted(enumerate(df['status']), key=sort_key)]

df = df.iloc[idx, :]

print(df)

     name                      status
0    Paul                          ok
2  Robert      must read 3 more books
1    Jean     must read 20 more books
3    John  does not read any book yet
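
Since the answer compares this to numpy.argsort, here is a rough sketch of what the literal numpy.argsort version could look like. The helper name status_rank is invented for this example; the ranking rules are the same as in sort_key above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Paul', 'Jean', 'Robert', 'John'],
                   'status': ['ok', 'must read 20 more books',
                              'must read 3 more books', 'does not read any book yet']})

def status_rank(status):
    # Hypothetical helper: map each status string to a sortable number.
    if status == 'ok':
        return -1.0
    if status == 'does not read any book yet':
        return np.inf
    # 'must read N more books' -> N
    return float(status.split()[2])

# Numeric rank per row, then the positions that would sort those ranks.
ranks = df['status'].map(status_rank).to_numpy()
order = np.argsort(ranks, kind='stable')

result = df.iloc[order]
print(result)
```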
Sort Strings Containing Numbers and Delimiters in Pandas



By : Kerli Low
Date : December 25 2020, 09:30 PM
You need to split the strings at each of the delimiter characters and convert the numeric pieces to integers, then sort on a tuple of ints. For example:
code :
import pandas as pd
lis=[]

# mix up numbers / strings and values
for i in ['103','99','102','101']:
    for j in map(str,[10,2,34,4,5,1,22,21,3]):
        for k in map(str,[1,2,33,16,17]):
            lis.append(i+'_'+j+'-'+k)
df = pd.DataFrame(dict(Field=lis))

# split on the delimiters using a regex ('-' comes first in the class so it does NOT denote a char-range)
# convert all pieces to int and collect them in a tuple to sort on (separate column)
df["tup"] = df["Field"].str.split(r"[-_:]").apply(lambda x: tuple(map(int, x)))
# sort on the separate column
df = df.sort_values("tup")
print(df)
        Field            tup
70     99_1-1     (99, 1, 1)
71     99_1-2     (99, 1, 2)
73    99_1-16    (99, 1, 16)
74    99_1-17    (99, 1, 17)
72    99_1-33    (99, 1, 33)
50     99_2-1     (99, 2, 1)
51     99_2-2     (99, 2, 2)
53    99_2-16    (99, 2, 16)
54    99_2-17    (99, 2, 17)
..        ...            ...
34  103_22-17  (103, 22, 17)
32  103_22-33  (103, 22, 33)
10   103_34-1   (103, 34, 1)
11   103_34-2   (103, 34, 2)
13  103_34-16  (103, 34, 16)
14  103_34-17  (103, 34, 17)
12  103_34-33  (103, 34, 33)

[180 rows x 2 columns]

For comparison, the unsorted frame:
         Field
0     103_10-1
1     103_10-2
2    103_10-33
3    103_10-16
4    103_10-17
5      103_2-1
..         ...
173  101_21-16
174  101_21-17
175    101_3-1
176    101_3-2
177   101_3-33
178   101_3-16
179   101_3-17

[180 rows x 1 columns]
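
If you would rather not add a helper column, the same tuple-of-ints key can be used with Python's built-in sorted to compute row positions. This is just a sketch on a handful of made-up values, and numeric_key is a name invented for it.

```python
import re
import pandas as pd

def numeric_key(text):
    # Hypothetical helper: '103_10-1' -> (103, 10, 1)
    return tuple(int(part) for part in re.split(r"[-_:]", text))

# Small made-up sample in the same format as the Field column.
df = pd.DataFrame({"Field": ["103_10-1", "99_2-16", "99_1-33", "99_1-2", "103_2-1"]})

# Row positions ordered by the integer tuples, then applied with iloc.
order = sorted(range(len(df)), key=lambda pos: numeric_key(df["Field"].iloc[pos]))
result = df.iloc[order]

# For contrast: plain string sorting puts '103...' before '99...'.
as_text = sorted(df["Field"])

print(result)
```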
how to perform a groupby, sort, and concatenate strings in a pandas dataframe



By : WillyG
Date : March 29 2020, 07:55 AM
One approach is groupby() with an aggregation that joins the strings:
code :
df.groupby(['PK', 'Source'], as_index=False).Text.agg(' '.join)
# to make sure the words are joined in Line order, sort before grouping
(df.sort_values('Line')
        .groupby(['PK', 'Source'], as_index=False)
        .Text.agg(' '.join)
)
   PK Source             Text
0   1      A  The quick brown
1   2      A       fox jumped
2   3      A    over the lazy
3   4      A           yellow
4   5      A         dogs sam
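
The input frame is not shown in the post, so here is a small runnable sketch with made-up rows that have the same columns (PK, Source, Line, Text). It uses reset_index() instead of as_index=False, which should give the same shape of result.

```python
import pandas as pd

# Hypothetical input: rows are deliberately out of Line order.
df = pd.DataFrame({
    'PK': [1, 1, 1, 2, 2],
    'Source': ['A', 'A', 'A', 'A', 'A'],
    'Line': [2, 1, 3, 2, 1],
    'Text': ['quick', 'The', 'brown', 'jumped', 'fox'],
})

# Sort by Line first so each group's words are joined in reading order.
result = (df.sort_values('Line')
            .groupby(['PK', 'Source'])['Text']
            .agg(' '.join)
            .reset_index())

print(result)
```

groupby keeps the rows of each group in the order they arrive, which is why the sort has to happen before the grouping.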