If you’re a software developer working with Python, you’re probably aware of a common issue with dictionaries: trying to access or modify a key that isn’t present raises a KeyError. The default dictionary, or defaultdict, resolves this issue by assigning a default value to any missing key the first time it’s accessed. Engineers preparing for technical interviews for Python-heavy software engineering roles would especially benefit from an in-depth understanding of the defaultdict type. In this article, we’ll see how defaultdict works in Python.
defaultdict is a data structure very similar to a dictionary. It’s part of Python's standard library, in the collections module. This module contains various specialized container types, and defaultdict, an efficient and elegant data structure, is one of them.
defaultdict is a subclass of dict, so a defaultdict instance is a dictionary-like object that supports all the usual dict operations. Its main use is providing default values for missing keys: when a missing key is looked up with square brackets, the key is created with a default value instead of a KeyError being raised. So if you find yourself writing extra code to handle missing keys in a dictionary, consider using defaultdict, which is designed to solve that problem.
The first argument to defaultdict must be a callable that takes no arguments and returns a value, or None. Valid callables include functions, methods, and classes (including built-in types such as int and list). When a non-existent key is accessed or modified through square brackets, the callable stored in .default_factory is called to produce the default value for that key. The default value of .default_factory is None, so if no callable is passed as the first argument, defaultdict behaves like a regular dictionary and raises a KeyError for missing keys. In other words, passing no arguments and passing None lead to the same behavior.
We create a defaultdict by calling it with an argument such as int, list, or set. This argument is called default_factory: a function (or other callable) that returns the default value for missing keys. It is what gives defaultdict its key feature compared with a regular dictionary.
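To make the contrast concrete, here is a minimal sketch comparing a plain dict with a defaultdict on a missing key. The variable names (plain, counts, and so on) are illustrative only.

```python
from collections import defaultdict

# A plain dict raises KeyError when a missing key is read.
plain = {"apples": 3}
try:
    plain["oranges"]
    plain_result = "no error"
except KeyError as err:
    plain_result = f"KeyError: {err}"
print(plain_result)

# A defaultdict built with int supplies int() == 0 for the missing key
# and stores it under that key.
counts = defaultdict(int, {"apples": 3})
oranges = counts["oranges"]
print(oranges)
print(dict(counts))
```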
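The behavior without a factory can be checked with a short sketch like the one below. The names no_factory, raised, and type_error are placeholders chosen for this illustration.

```python
from collections import defaultdict

# No argument is the same as passing None: there is no factory to call.
no_factory = defaultdict()
print(no_factory.default_factory)

try:
    no_factory["missing"]
    raised = False
except KeyError:
    raised = True
print(raised)

# A first argument that is neither callable nor None is rejected.
try:
    defaultdict(0)
    type_error = False
except TypeError:
    type_error = True
print(type_error)

# default_factory is a writable attribute, so it can be assigned later.
no_factory.default_factory = list
no_factory["missing"].append(1)
print(dict(no_factory))
```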
Syntax: defaultdict(default_factory)
When default_factory is set, accessing or modifying a non-existent key with square brackets doesn’t raise a KeyError. Instead, the key is actually created and the default value is assigned to it. When default_factory is None, a KeyError is raised, just as with a regular dictionary.
defaultdict has this edge over regular dictionaries because it overrides .__missing__(), the method that dict’s item lookup calls when a key can’t be found, and because .__missing__() calls .default_factory() to build the default value. As we know, the first argument passed to defaultdict.__init__() can be a valid Python callable or None, and it is stored in the instance attribute .default_factory.
If the first argument is a callable, it’s called automatically whenever the value of a non-existent key is accessed or modified with square brackets. The callable takes no arguments and returns the default value to use for that key. The remaining arguments, including keyword arguments, are handled exactly as if they had been passed to a normal dict’s initializer.
Here’s a basic example of defaultdict in use:
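One detail worth seeing in code is that the key is created only by square-bracket access. In this small sketch (variable names are illustrative), a membership test and .get() leave the defaultdict untouched, while a subscript lookup inserts the key.

```python
from collections import defaultdict

boxes = defaultdict(int)

# Neither a membership test nor .get() triggers the default factory,
# so no key is created here.
has_peach = "peach" in boxes
got = boxes.get("peach")
keys_before = list(boxes)

# Square-bracket access creates the key with the default value.
value = boxes["peach"]
keys_after = list(boxes)

print(has_peach, got, keys_before)
print(value, keys_after)
```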
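The mechanism can be approximated in pure Python. The class below, SketchDefaultDict, is a hypothetical and simplified stand-in written for this illustration; it is not the actual implementation in collections, only a sketch of the same idea using the missing-key hook that dict subclasses may define.

```python
class SketchDefaultDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        if default_factory is not None and not callable(default_factory):
            raise TypeError("first argument must be callable or None")
        # Remaining arguments go to the normal dict initializer.
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict's item lookup calls this hook only when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value  # store the default so later lookups find it
        return value


groups = SketchDefaultDict(list)
groups["fruit"].append("apple")
print(groups)
```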
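As a brief illustration of the remaining arguments, the sketch below passes an existing mapping and a keyword argument after the factory. The name stock and its contents are made up for the example.

```python
from collections import defaultdict

# After the factory, arguments are treated like dict(...) arguments:
# here a mapping plus one keyword argument.
stock = defaultdict(int, {"apples": 3}, oranges=1)
print(dict(stock))

# Missing keys still fall back to the factory.
peaches = stock["peaches"]
print(peaches)
```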
# Creating defaultdict
#Importing defaultdict from collections
from collections import defaultdict
# Defining a defaultdict of type list
ddexample = defaultdict(list)
ddexample['key'].append(3)
ddexample['key'].append(2)
print("ddexample contains: ")
print(ddexample)
print("A non-existent key when accessed in ddexample contains: ")
print(ddexample['keynew'])
Output:
ddexample contains:
defaultdict(<class 'list'>, {'key': [3, 2]})
A non-existent key when accessed in ddexample contains:
[]
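The difference between defaultdict and dict comes down to a small number of members that defaultdict adds or overrides, most importantly the .__missing__() method and the .default_factory attribute. (Everything else works the same for the subclass defaultdict as for its parent dict.)
Also, dict provides the method .setdefault() to supply a value for a non-existent key when needed. With defaultdict, we specify the default once, up front, which can be faster than calling .setdefault() repeatedly, partly because the default passed to .setdefault() has to be built on every call, even when the key already exists.
Let us look at some ways we can use default_factory. The example at the end of this section covers grouping items with list, grouping unique items with set, counting with int, and returning a custom default with lambda.
If we want to add flexibility and pass arguments to the callable stored in .default_factory, we can wrap it in a zero-argument callable, for example with a lambda function or functools.partial().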
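The two approaches can be compared side by side. This sketch groups the same sample pairs (invented for the illustration) once with .setdefault() on a plain dict and once with a defaultdict.

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "kiwi")]

# Plain dict: .setdefault() is given a fresh empty list on every call,
# even when the key already exists.
with_setdefault = {}
for kind, name in pairs:
    with_setdefault.setdefault(kind, []).append(name)

# defaultdict: the factory runs only when the key is missing.
with_defaultdict = defaultdict(list)
for kind, name in pairs:
    with_defaultdict[kind].append(name)

print(with_setdefault)
print(dict(with_defaultdict))
```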
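A minimal sketch of passing arguments to the factory follows, assuming lambda and functools.partial are the wrapping tools. The helper make_label is hypothetical and exists only for this illustration.

```python
from collections import defaultdict
from functools import partial


def make_label(prefix, count):
    """Hypothetical helper that needs arguments to build a default."""
    return f"{prefix}-{count}"


# A lambda wraps the call in a zero-argument callable.
by_lambda = defaultdict(lambda: make_label("box", 0))

# partial freezes the arguments and is itself callable with no arguments.
by_partial = defaultdict(partial(make_label, "crate", count=1))

print(by_lambda["missing"])
print(by_partial["missing"])
```

Either wrapper works; partial can be handy when the factory is an existing function and only its arguments vary.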
Code:
# Creating defaultdict
#Importing defaultdict from collections
from collections import defaultdict
# Grouping Items: Using List as default_factory
print("Grouping Items: Using List as default_factory \n")
listdef = defaultdict(list)
listdef['ApplesInADay'].append(1)
listdef['ApplesInADay'].append(3)
print("Fruits had in a day for the past few days: ")
print(listdef)
#Accessing non-existing key
print("Oranges had in a day for the past few days: ")
print(listdef['OrangesInADay'])
print("Fruits had in a day for the past few days updated: ")
print(listdef)
# Grouping Unique Items: Using set as default_factory
print("Grouping Unique Items: Using set as default_factory\n")
setdef = defaultdict(set)
setdef['apples'].add('a')
setdef['apples'].add('p')
setdef['apples'].add('l')
setdef['apples'].add('e')
setdef['kiwi'].add('k')
setdef['kiwi'].add('i')
setdef['kiwi'].add('w')
print("Set of letters in different fruits: ")
print(setdef)
#Accessing non-existing key
print("Set of letters in oranges: ")
print(setdef['oranges'])
print("Set of letters in different fruits updated: ")
print(setdef)
# Counting Items: Using int as default_factory
print("Counting Items: Using int as default_factory\n")
intdef = defaultdict(int)
intdef['AppleBoxes']=3
intdef['OrangeBoxes']=1
print("Number of fruit boxes received today: ")
print(intdef)
#Accessing non-existing key
print("Number of Peach boxes received today: ")
print(intdef['PeachBoxes'])
print("Number of fruit boxes received today updated: ")
print(intdef)
# Using lambda as default_factory argument
print("Using lambda as default_factory argument\n")
lambdadef = defaultdict(lambda: "No Box Present")
lambdadef['AppleBoxes']=3
lambdadef['OrangeBoxes']=1
print("Number of fruit boxes received today: ")
print(lambdadef)
#Accessing non-existing key
print("Number of Peach boxes received today: ")
print(lambdadef['PeachBoxes'])
print("Number of fruit boxes received today updated: ")
print(lambdadef)
Output:
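The output shown is from Python 3.x. Python 2.x gives the same results with a slightly different representation, for example <type 'list'> instead of <class 'list'>. The order of elements printed inside a set, and the memory address shown for the lambda, may also differ from run to run.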
Grouping Items: Using List as default_factory
Fruits had in a day for the past few days:
defaultdict(<class 'list'>, {'ApplesInADay': [1, 3]})
Oranges had in a day for the past few days:
[]
Fruits had in a day for the past few days updated:
defaultdict(<class 'list'>, {'ApplesInADay': [1, 3], 'OrangesInADay': []})
Grouping Unique Items: Using set as default_factory
Set of letters in different fruits:
defaultdict(<class 'set'>, {'apples': {'a', 'e', 'p', 'l'}, 'kiwi': {'i', 'w', 'k'}})
Set of letters in oranges:
set()
Set of letters in different fruits updated:
defaultdict(<class 'set'>, {'apples': {'a', 'e', 'p', 'l'}, 'kiwi': {'i', 'w', 'k'}, 'oranges': set()})
Counting Items: Using int as default_factory
Number of fruit boxes received today:
defaultdict(<class 'int'>, {'AppleBoxes': 3, 'OrangeBoxes': 1})
Number of Peach boxes received today:
0
Number of fruit boxes received today updated:
defaultdict(<class 'int'>, {'AppleBoxes': 3, 'OrangeBoxes': 1, 'PeachBoxes': 0})
Using lambda as default_factory argument
Number of fruit boxes received today:
defaultdict(<function <lambda> at 0x104cce0d0>, {'AppleBoxes': 3, 'OrangeBoxes': 1})
Number of Peach boxes received today:
No Box Present
Number of fruit boxes received today updated:
defaultdict(<function <lambda> at 0x104cce0d0>, {'AppleBoxes': 3, 'OrangeBoxes': 1, 'PeachBoxes': 'No Box Present'})
Question 1: Can a defaultdict be equal to a dict?
Answer: Yes. If a defaultdict and a dict store exactly the same items, they compare equal: create one of each with the same items, and comparing them with == evaluates to True. Also, if no default_factory is passed to defaultdict, it handles missing keys the same way a regular dictionary does.
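A quick check of this answer might look like the sketch below (names are illustrative). It also shows a caveat: reading a missing key from the defaultdict inserts that key, so the two objects stop being equal afterwards.

```python
from collections import defaultdict

plain = {"apples": 3, "oranges": 1}
dd = defaultdict(int, apples=3, oranges=1)

# Equality compares the stored items, not the types.
same = dd == plain
print(same)

# Reading a missing key inserts it into the defaultdict only.
dd["peaches"]
same_after = dd == plain
print(same_after)
```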
Question 2: Can we avoid getting a KeyError in a dict without using defaultdict?
Answer: Yes. When the values are mutable collections like list, set, or dict, initializing them for a key before its first use avoids the KeyError. Methods such as .get() and .setdefault(), a membership test with in, or a try/except KeyError block also work. However, defaultdict automates this process and makes it easier, so it may be preferred.
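For reference, here is a short sketch of common ways to handle a missing key in a plain dict without defaultdict. The dictionary stock and its keys are invented for the example.

```python
stock = {"apples": 3}

# .get() returns a fallback without modifying the dict.
peaches = stock.get("peaches", 0)

# .setdefault() inserts the fallback if the key is absent,
# then returns the stored value.
kiwis = stock.setdefault("kiwis", 0)

# An explicit membership test before the lookup.
oranges = stock["oranges"] if "oranges" in stock else 0

# Catching the KeyError directly.
try:
    plums = stock["plums"]
except KeyError:
    plums = 0

print(peaches, kiwis, oranges, plums)
print(stock)
```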
If you’re looking for guidance and help with getting your prep started, sign up for our free webinar. As pioneers in the field of technical interview prep, we have trained thousands of software engineers to crack the toughest coding interviews and land jobs at their dream companies, such as Google, Facebook, Apple, Netflix, Amazon, and more!
-----------
Article contributed by Tanya Shrivastava