gotcha | python | Critical
Difference between map, applymap and apply methods in Pandas
Problem
Can you tell me when to use these vectorization methods with basic examples?
I see that map is a Series method whereas the rest are DataFrame methods. I got confused about the apply and applymap methods, though. Why do we have two methods for applying a function to a DataFrame? Again, simple examples which illustrate the usage would be great!
Solution
Comparing map, applymap and apply: Context Matters
The major differences are:
Definition
- map is defined on Series only
- applymap is defined on DataFrames only
- apply is defined on both
Input argument
- map accepts dict, Series, or callable
- applymap and apply accept callable only
Behavior
- map is elementwise for Series
- applymap is elementwise for DataFrames
- apply also works elementwise but is suited to more complex operations and aggregation. The behaviour and return value depend on the function.
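The definition, input and behaviour differences listed so far can be sketched on toy data. This is a minimal illustration, not a benchmark: the Series `s`, the frame `df` and the lambdas are made-up examples. One assumption to flag: pandas 2.1 renamed DataFrame.applymap to DataFrame.map, so the sketch picks whichever of the two the installed version provides.

```python
import pandas as pd

s = pd.Series([1, 2, 3], name="A")
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# map (Series only) accepts a dict, a Series, or a callable.
by_dict = s.map({1: "a", 2: "b", 3: "c"})
by_series = s.map(pd.Series(["a", "b", "c"], index=[1, 2, 3]))
by_callable = s.map(lambda x: x * 10)

# Elementwise over a DataFrame: applymap, or DataFrame.map on pandas >= 2.1.
elementwise = df.map if hasattr(df, "map") else df.applymap
doubled = elementwise(lambda x: x * 2)

# apply on a Series: the callable receives one element at a time.
squared = s.apply(lambda x: x ** 2)

# apply on a DataFrame: the callable receives a whole column (axis=0 by
# default), so it can aggregate.
col_range = df.apply(lambda col: col.max() - col.min())

print(by_dict.tolist(), by_callable.tolist())
print(doubled)
print(col_range)
```

Note that the callable given to DataFrame.apply receives a whole column (or a row with axis=1) rather than a single cell, which is what makes aggregation possible there.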
Use case (the most important difference)
- map is meant for mapping values from one domain to another, so it is optimised for performance, e.g., df['A'].map({1:'a', 2:'b', 3:'c'})
- applymap is good for elementwise transformations across multiple rows/columns, e.g., df[['A', 'B', 'C']].applymap(str.strip)
- apply is for applying any function that cannot be vectorised, e.g., df['sentences'].apply(nltk.sent_tokenize)
Also see When should I (not) want to use pandas apply() in my code? for a writeup I made a while back on the most appropriate scenarios for using apply. (Note that there aren't many, but there are a few; apply is generally slow.)
Summarising
| | map | applymap | apply |
|---|---|---|---|
| Defined on Series? | Yes | No | Yes |
| Defined on DataFrame? | No | Yes | Yes |
| Argument | dict, Series, or callable [1] | callable [2] | callable |
| Elementwise? | Yes | Yes | Yes |
| Aggregation? | No | No | Yes |
| Use Case | Transformation/mapping [3] | Transformation | More complex functions |
| Returns | Series | DataFrame | scalar, Series, or DataFrame [4] |
Footnotes
1. map when passed a dictionary/Series will map elements based on the keys in that dictionary/Series. Missing values will be recorded as NaN in the output.
2. applymap in more recent versions has been optimised for some operations. You will find applymap slightly faster than apply in some cases. My suggestion is to test them both and use whatever works better.
3. map is optimised for elementwise mappings and transformation. Operations that involve dictionaries or Series will enable pandas to use faster code paths for better performance.
4. Series.apply returns a scalar for aggregating operations, Series otherwise. Similarly for DataFrame.apply. Note that apply also has fastpaths when called with certain NumPy functions such as mean, sum, etc.
Code Snippets
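As a hedged illustration of two footnoted behaviours (map leaving NaN for keys missing from the dictionary, and the return type of apply depending on the function), here is a small self-contained sketch. The data and lambdas are invented for the example and only DataFrame.apply is shown.

```python
import pandas as pd

s = pd.Series([1, 2, 4])

# 4 is not a key in the dict, so its mapped value is NaN.
mapped = s.map({1: "a", 2: "b", 3: "c"})

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# A reducing function gives one value per column, so the result is a Series.
totals = df.apply(lambda col: col.sum())

# A shape-preserving function gives back a DataFrame.
shifted = df.apply(lambda col: col - col.min())

print(mapped)
print(type(totals).__name__, type(shifted).__name__)
```

For plain sums like this, df.sum() is the vectorised choice; apply appears here only to show how its return type follows the function.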
df['A'].map({1:'a', 2:'b', 3:'c'})
df[['A', 'B', 'C']].applymap(str.strip)
df['sentences'].apply(nltk.sent_tokenize)
Context
Stack Overflow Q#19798153, score: 372