Refining Testable Python Code: Best Practices for I/O Separation
A few weeks ago, I came across an enlightening talk by Brandon Rhodes that captivated me. I felt compelled to experiment with one of the examples he presented and document my insights.
A significant lesson I took away was the necessity of decoupling input/output (I/O) operations—such as network requests and database interactions—from the main logic of our code. This approach enhances both modularity and testability.
Clean Architecture and Clean Code are valuable concepts, but I won't dive into them in this article. My focus is on practical techniques you can apply immediately.
Let’s start by examining an example directly lifted from Brandon Rhodes’ presentation.
Analyzing the find_definition Function
Consider a Python function named find_definition that processes data and makes HTTP requests to an external API.
import requests                      # Listing 1
from urllib.parse import urlencode

def find_definition(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = requests.get(url)     # I/O
    data = response.json()           # I/O
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition
Creating Our Initial Test
To test the find_definition function, we can use Python's built-in unittest module. Below is an example of how we might approach it:
import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
    @patch('requests.get')
    def test_find_definition(self, mock_get):
        mock_response = {u'Definition': 'Visit tournacat.com'}
        mock_get.return_value.json.return_value = mock_response
        expected_definition = 'Visit tournacat.com'
        definition = find_definition('tournacat')
        self.assertEqual(definition, expected_definition)
        mock_get.assert_called_with('http://api.duckduckgo.com/?q=define+tournacat&format=json')
Using the patch decorator from the unittest.mock module allows us to simulate the behavior of the requests.get function, providing controlled responses during testing. This enables us to test the find_definition function in isolation without making actual HTTP requests.
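The chained attribute in the test, mock_get.return_value.json.return_value, can look opaque at first. The following standard-library sketch walks through what a mock configured this way returns. Here mock_get is created by hand rather than by patch, and the URL and definition text are placeholders.

```python
from unittest.mock import MagicMock

# Stand-in for the object that patch('requests.get') would pass to the test.
mock_get = MagicMock()

# Calling mock_get(...) yields mock_get.return_value; calling .json() on that
# object yields whatever is stored here.
mock_get.return_value.json.return_value = {'Definition': 'a placeholder definition'}

response = mock_get('http://example.invalid/?q=define+word')  # placeholder URL
data = response.json()
print(data['Definition'])

# The mock also records how it was called, which is what the
# assert_called_with family of methods checks.
mock_get.assert_called_once_with('http://example.invalid/?q=define+word')
```

Nothing in this sketch touches the network: every call is answered by the mock itself.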
Challenges with Testing and Tight Coupling
By mocking with the patch decorator, however, we inadvertently couple our tests tightly to the internal implementation of the function: the test has to know that find_definition calls requests.get and which exact URL string it passes. Such tests are fragile and may break whenever the implementation or its dependencies change.
Any of the following changes to find_definition would force us to update the test:
- Switching to a different HTTP library
- Altering the structure of the API's response
- Modifying the API endpoint
This makes maintaining unit tests for find_definition quite cumbersome.
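Even a change that leaves the request semantically identical can trip an exact-string assertion on the URL. As an illustration only, the sketch below builds the same query with the parameters in a different order and compares the two URLs both as raw strings and as parsed queries. The query_params helper is hypothetical and not part of the original talk.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = 'http://api.duckduckgo.com/?'

# Two equivalent requests whose parameters are encoded in a different order.
url_a = BASE + urlencode({'q': 'define example', 'format': 'json'})
url_b = BASE + urlencode({'format': 'json', 'q': 'define example'})


def query_params(url):
    """Hypothetical helper: return the query string as a dict of value lists."""
    return parse_qs(urlsplit(url).query)


print(url_a == url_b)                              # False: the strings differ
print(query_params(url_a) == query_params(url_b))  # True: the parameters match
```

Comparing parsed parameters is one way to loosen such an assertion, although the test would still depend on which HTTP function gets called.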
Concealing I/O: A Common Pitfall
Often, when working with functions like find_definition that involve I/O, I would refactor the code to extract I/O operations into a separate function, such as call_json_api, as shown in the updated example below (also sourced from Brandon’s slides):
def find_definition(word):           # Listing 2
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    data = call_json_api(url)
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition

def call_json_api(url):
    response = requests.get(url)     # I/O
    data = response.json()           # I/O
    return data
By isolating the I/O operations into a dedicated function, we enhance abstraction and encapsulation. The find_definition function now delegates the task of making the HTTP request and parsing the JSON response to call_json_api.
Updating the Test
We can again utilize the patch decorator to mock the behavior of the call_json_api function instead of requests.get. This allows us to control the response received by find_definition during testing.
import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
    # patch() needs a dotted 'module.attribute' target. Using __name__ assumes
    # the test lives in the same module as find_definition and call_json_api.
    @patch(__name__ + '.call_json_api')
    def test_find_definition(self, mock_call_json_api):
        mock_response = {u'Definition': 'Visit tournacat.com'}
        mock_call_json_api.return_value = mock_response
        expected_definition = 'Visit tournacat.com'
        definition = find_definition('tournacat')
        self.assertEqual(definition, expected_definition)
        mock_call_json_api.assert_called_with('http://api.duckduckgo.com/?q=define+tournacat&format=json')
“Have We Truly Decoupled I/O?”
While we have concealed the I/O operations behind call_json_api, it's crucial to note that we haven't fully decoupled them. The find_definition function still calls call_json_api directly, so every call to find_definition performs a network request unless a test patches it out.
Dependency Injection: Achieving Decoupling
We can further separate I/O operations using dependency injection, leading to a more decoupled design. Here’s an updated version of find_definition:
import requests
from urllib.parse import urlencode

def find_definition(word, api_client=requests):  # Dependency injection
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = api_client.get(url)   # I/O
    data = response.json()           # I/O
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition
The api_client parameter represents the dependency responsible for API calls. By default it is the requests module, so production callers need not pass anything, while tests can pass any substitute object that provides a compatible get method.
Unit Testing with Dependency Injection
Dependency injection enhances control and predictability in unit testing. Below is an example of how we can write tests for find_definition with this approach:
import unittest
from unittest.mock import MagicMock

class TestFindDefinition(unittest.TestCase):
    def test_find_definition(self):
        mock_response = {u'Definition': u'How to add Esports schedules to Google Calendar?'}
        mock_api_client = MagicMock()
        mock_api_client.get.return_value.json.return_value = mock_response
        word = 'example'
        expected_definition = 'How to add Esports schedules to Google Calendar?'
        definition = find_definition(word, api_client=mock_api_client)
        self.assertEqual(definition, expected_definition)
        mock_api_client.get.assert_called_once_with('http://api.duckduckgo.com/?q=define+example&format=json')
In this updated unit test, we create a mock API client using the MagicMock class, configuring it to return a predefined response when its get method is invoked.
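Because the dependency is now passed in, the substitute does not have to be a MagicMock at all. As a rough sketch, a small hand-written fake works as well. FakeClient and FakeResponse are hypothetical names introduced here for illustration, and the default api_client=requests is omitted so the example runs with the standard library alone.

```python
from urllib.parse import urlencode


def find_definition(word, api_client):
    # Same logic as the injected version above; the default argument is left
    # out so this sketch runs without the requests package installed.
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = api_client.get(url)
    data = response.json()
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


class FakeResponse:
    """Hypothetical stand-in for the HTTP response object."""

    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


class FakeClient:
    """Hypothetical stand-in exposing only the get() method find_definition uses."""

    def __init__(self, payload):
        self._payload = payload
        self.requested_urls = []  # records every URL that was requested

    def get(self, url):
        self.requested_urls.append(url)
        return FakeResponse(self._payload)


client = FakeClient({'Definition': 'a placeholder definition'})
result = find_definition('example', api_client=client)
print(result)
print(client.requested_urls)
```

A fake like this makes the expected interface explicit, at the cost of a little more code than a MagicMock.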
Challenges with Dependency Injection
Though dependency injection presents numerous advantages, it also poses challenges. As Brandon pointed out, consider these potential issues:
- Mock vs. Real Library: Mock objects may not fully mimic the behavior of real dependencies, leading to differences between test outcomes and actual runtime behavior.
- Complex Dependencies: Functions with multiple dependencies, like those involving databases, filesystems, and external services, can complicate injection setup and management.
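The first issue is easy to reproduce. In the sketch below, a bare MagicMock happily accepts a misspelled method name, while a mock restricted with the spec argument rejects it. The attribute list passed to spec is an assumption based on the single get call that find_definition makes, and the URL is a placeholder.

```python
from unittest.mock import MagicMock

# A bare MagicMock accepts any attribute, including ones the real dependency
# does not have, so a typo in production code would go unnoticed.
loose_client = MagicMock()
loose_client.gett('http://example.invalid/')   # no error, despite the typo

# Restricting the mock to a known set of attribute names narrows the gap.
strict_client = MagicMock(spec=['get'])
strict_client.get('http://example.invalid/')   # allowed: 'get' is in the spec

try:
    strict_client.gett('http://example.invalid/')
except AttributeError as exc:
    typo_error = exc       # the misspelled call is rejected
else:
    typo_error = None

print(type(typo_error).__name__)
```

Passing the real library as spec, or using create_autospec from unittest.mock, gives a tighter check in a real test suite, though neither verifies how the remote service actually behaves.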
Separating I/O Operations from Core Logic
To cultivate flexible and testable code, we can adopt an alternative strategy that doesn't rely on explicit dependency injection.
We can achieve a clear separation of concerns by placing I/O operations at the outermost layer of our code. Below is an example illustrating this concept:
def find_definition(word):           # Listing 3
    url = build_url(word)
    data = requests.get(url).json()  # I/O
    return pluck_definition(data)
Here, find_definition itself becomes the thin outer layer: it performs the only I/O (making the HTTP request and obtaining the JSON response) and hands everything else to plain functions that never touch the network.
Additionally, the find_definition function depends on two auxiliary functions:
- The build_url function constructs the API request URL.
- The pluck_definition function extracts the definition from the API response.
Here are the relevant code snippets:
def build_url(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    return url

def pluck_definition(data):
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition
By positioning I/O at the outermost layer, the code gains flexibility. We can create functions that are individually testable and replaceable as necessary.
For instance, one could easily modify the build_url function to switch to a different API endpoint or manage alternative error scenarios within the pluck_definition function.
This separation of concerns allows us to adjust the I/O layer without touching build_url or pluck_definition, where the core logic now lives, thereby improving the maintainability and adaptability of the codebase.
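To make that concrete, here is a minimal sketch that swaps the HTTP library for the standard library's urllib.request. The choice of urllib.request is mine, purely for illustration. Only the I/O lines inside find_definition change, while build_url and pluck_definition are copied unchanged from the listings above.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_url(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    return url


def pluck_definition(data):
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


def find_definition(word):
    url = build_url(word)
    # I/O: the only lines that differ from Listing 3.
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    return pluck_definition(data)
```

Tests for build_url and pluck_definition would keep passing after this swap, because neither function knows which HTTP library sits above it.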
Updating Unit Tests (Again)
To showcase the enhanced flexibility and control provided by the modular design, let’s refresh our unit tests for the find_definition function.
Here’s the revised code snippet:
import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
    @patch('requests.get')
    def test_find_definition(self, mock_get):
        mock_response = {'Definition': 'Visit tournacat.com'}
        mock_get.return_value.json.return_value = mock_response
        word = 'example'
        expected_definition = 'Visit tournacat.com'
        definition = find_definition(word)
        self.assertEqual(definition, expected_definition)
        mock_get.assert_called_once_with(build_url(word))

    def test_build_url(self):
        word = 'example'
        expected_url = 'http://api.duckduckgo.com/?q=define+example&format=json'
        url = build_url(word)
        self.assertEqual(url, expected_url)

    def test_pluck_definition(self):
        mock_response = {'Definition': 'What does tournacat.com do?'}
        expected_definition = 'What does tournacat.com do?'
        definition = pluck_definition(mock_response)
        self.assertEqual(definition, expected_definition)

if __name__ == '__main__':
    unittest.main()
In the revamped unit tests, we now possess distinct test methods for each modular component:
- test_find_definition checks the find_definition function end to end, asserting that requests.get is called with the URL generated by build_url. It is the only test that still needs a mock, because find_definition is the only function that performs I/O.
- test_build_url ensures the build_url function accurately constructs the URL based on the provided word.
- test_pluck_definition verifies that the pluck_definition function correctly extracts the definition from the supplied data.
By refreshing our unit tests, we can now assess each component independently, confirming their functionality in isolation.
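One benefit of testing pluck_definition in isolation is that error paths can be exercised without any mocks. The tests above do not cover the empty-definition case, so here is a possible sketch of how it might be tested. The test class name is hypothetical, and the missing-key test assumes a payload without the key is simply left to raise KeyError.

```python
import unittest


def pluck_definition(data):
    # Copied from the listing above so the sketch is self-contained.
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


class TestPluckDefinitionErrors(unittest.TestCase):
    def test_empty_definition_raises(self):
        with self.assertRaises(ValueError):
            pluck_definition({'Definition': ''})

    def test_missing_key_raises(self):
        # Assumption: a payload without the key is left to raise KeyError.
        with self.assertRaises(KeyError):
            pluck_definition({})


# Run the suite programmatically so the sketch also works inside a larger script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPluckDefinitionErrors)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Both tests pass plain dictionaries to the function, with no patching or fake clients involved.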
Conclusion
In summary, we have explored various approaches to refactoring to mitigate tight coupling and attain loose coupling among components. Additionally, we have observed how unit testing can be improved through the mocking of I/O operations and the management of external dependencies.
By positioning I/O operations at the outermost layer of our code, we achieve a clear separation of concerns, bolstering the modularity and maintainability of our codebase.