Problem Statement
Faria Education Group's Public APIs are a valuable resource; however, steps must be taken to ensure that usage is optimized for both the client and the server. This article demonstrates two best practices for retrieving large datasets: pagination and delta updates.
Related: Rate limitations and Throttling
Understanding Pagination
Pagination is a common technique in API design used to break up large query results into smaller, more manageable responses. Both the ManageBac and OpenApply APIs use pagination to handle large collections of data efficiently. It is implemented on endpoints such as "Get all students" and "Get all parents." These are commonly referred to as list endpoints.
A common misunderstanding is that a single call to "Get all Students" retrieves all students in the response. Instead, the response only includes a certain number of items from the queried collection, starting by default at the beginning of the list. Query parameters are available to change how many items are returned and to return items starting from the middle of the collection instead of the beginning.
How Pagination Works
When using list endpoints, each response contains a meta property, which provides information about:
- The number of records returned per page
- The total number of pages
This information helps determine how many additional requests are needed to retrieve all data.
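As a minimal sketch of that calculation, the helper below derives the number of remaining requests from a meta object. It assumes the current_page and total_pages field names shown in the sample response later in this article; the function name remaining_requests is purely illustrative.

```python
def remaining_requests(meta: dict) -> int:
    """Return how many more page requests are needed after the current page."""
    # Assumes the meta object exposes current_page and total_pages.
    return max(meta["total_pages"] - meta["current_page"], 0)


# Sample values only; a real meta object comes from the API response.
sample_meta = {"current_page": 1, "total_pages": 4, "per_page": 100}
print(remaining_requests(sample_meta))  # 3
```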
Example Requests Without Pagination Parameters
ManageBac API:
GET https://api.managebac.com/v2/students
OpenApply API:
GET https://mydomain.openapply.com/api/v3/students
Above, no query parameters have been explicitly passed, and so defaults are used. The endpoint will respond with the first 100 or 200 records, starting from the beginning.
Implementing Pagination
To retrieve results beyond the first page, include a pagination parameter in your request by appending ?page= and the page number to the query:
GET https://api.openapply.com/api/v3/students?page=2
GET https://api.managebac.com/v2/students?page=2
If additional parameters are required, use & to separate them. The per_page parameter (ManageBac) or count parameter (OpenApply) can sometimes be used to increase the number of results per page. However, the maximum allowable value is subject to change and may be adjusted at any time to optimize server resources. Refer to the API documentation for current limits.
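Rather than concatenating ? and & by hand, the query string can be assembled programmatically. The sketch below uses Python's standard urllib.parse.urlencode; the helper name build_list_url and its size_param argument are hypothetical, and the base URLs are the ones used in this article's examples.

```python
from urllib.parse import urlencode


def build_list_url(base_url: str, page: int, page_size: int,
                   size_param: str = "per_page") -> str:
    """Build a list-endpoint URL with page and page-size query parameters."""
    # size_param is "per_page" for ManageBac and "count" for OpenApply,
    # following the examples in this article.
    query = urlencode({"page": page, size_param: page_size})
    return f"{base_url}?{query}"


print(build_list_url("https://api.managebac.com/v2/students", 2, 100))
# https://api.managebac.com/v2/students?page=2&per_page=100
```

Passing size_param="count" produces the OpenApply form of the same request.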
The following demonstrates the sequence of calls required to collect all of the student records:
ManageBac API:
GET https://api.managebac.com/v2/students?page=1&per_page=100
GET https://api.managebac.com/v2/students?page=2&per_page=100
GET https://api.managebac.com/v2/students?page=3&per_page=100
GET https://api.managebac.com/v2/students?page=4&per_page=100
OpenApply API:
GET https://api.openapply.com/api/v3/students?page=1&count=100
GET https://api.openapply.com/api/v3/students?page=2&count=100
GET https://api.openapply.com/api/v3/students?page=3&count=100
GET https://api.openapply.com/api/v3/students?page=4&count=100
The meta property in responses indicates how many pages remain:
{
"students": [{
// ..
}],
"meta": {
"current_page": 1,
"next_page": 2,
"total_pages": 4,
"total_count": 342,
"per_page": 100
}
}
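The loop implied by this response can be sketched as follows. This is one possible approach, not an official client: fetch_page is a hypothetical callable that performs the HTTP request for a given page number (including any authentication) and returns the decoded JSON body, and the loop stops once the requested page reaches meta.total_pages. The fake fetcher exists only to make the sketch runnable without network access.

```python
from typing import Callable, Iterator


def iter_all_records(fetch_page: Callable[[int], dict],
                     collection_key: str) -> Iterator[dict]:
    """Yield every record of a list endpoint by walking its pages in order."""
    page = 1
    while True:
        body = fetch_page(page)          # one HTTP request per page
        yield from body[collection_key]  # e.g. the "students" array
        if page >= body["meta"]["total_pages"]:
            break                        # last page reached
        page += 1


def fake_fetch_page(page: int) -> dict:
    """Stand-in for a real request; returns made-up data shaped like the sample."""
    pages = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}
    return {"students": pages[page],
            "meta": {"current_page": page, "total_pages": 2}}


print([s["id"] for s in iter_all_records(fake_fetch_page, "students")])
# [1, 2, 3]
```

Injecting the fetch function keeps the paging logic independent of any particular HTTP library and makes it straightforward to add delays between requests if rate limits apply.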
Understanding Delta Updates
There are many scenarios where you only need to retrieve new or modified records since your last request, and where delta queries can be used. For example, there may be a synchronization process running every day overnight. Instead of fetching all student records each time, only the student records that have since been updated can be fetched. This method significantly reduces the amount of data transferred, and results in scripts completing faster as well.
How Delta Queries Work:
- Initial Data Fetch: Retrieve the full dataset and store it in a database. For ManageBac, MBPY is available to do this automatically.
- Subsequent Requests: Pass the date the records were last fetched as the modified_since query parameter (in MB) or the since_date query parameter (in OA).
- Handle Pagination: The results may be spread across multiple pages, in which case the page query parameter needs to be used.
- Store the Date: Persist the date of the fetch so that it can be passed to the list endpoint the next time delta updates are pulled.
The following demonstrates the sequence of calls required to collect updated student records for MB and OA:
ManageBac API:
GET https://api.managebac.com/v2/students?page=1&per_page=100&modified_since=2025-04-01
GET https://api.managebac.com/v2/students?page=2&per_page=100&modified_since=2025-04-01
OpenApply API:
GET https://api.openapply.com/api/v3/students?page=1&count=100&since_date=2025-04-01
GET https://api.openapply.com/api/v3/students?page=2&count=100&since_date=2025-04-01
In this example, only records that were updated since April 1, 2025 are returned, and only two pages are required to collect all of those records.
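The bookkeeping behind a delta sync can be sketched as below. This is an illustration under stated assumptions rather than a reference implementation: the state file layout, the helper names, and the choice to store the date as YYYY-MM-DD (matching the example requests above) are all hypothetical. Only the modified_since and since_date parameter names come from this article.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def load_last_sync(state_file: Path) -> Optional[str]:
    """Return the stored sync date (YYYY-MM-DD), or None before the first run."""
    if state_file.exists():
        return json.loads(state_file.read_text())["last_sync"]
    return None


def save_last_sync(state_file: Path, day: date) -> None:
    """Persist the sync date so the next run can request only newer changes."""
    state_file.write_text(json.dumps({"last_sync": day.isoformat()}))


def delta_params(page: int, page_size: int, last_sync: Optional[str],
                 since_param: str = "modified_since",
                 size_param: str = "per_page") -> dict:
    """Build query parameters; omit the date filter on the initial full fetch."""
    params = {"page": page, size_param: page_size}
    if last_sync is not None:
        params[since_param] = last_sync
    return params


print(delta_params(1, 100, "2025-04-01"))
# {'page': 1, 'per_page': 100, 'modified_since': '2025-04-01'}
```

For OpenApply, pass since_param="since_date" and size_param="count". Recording the date at the start of a run, and saving it only after every page has been processed, avoids missing records that change while the sync is in progress.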
Understanding Filters
To improve efficiency, limit query results to only the necessary data. Both ManageBac and OpenApply offer additional parameters to filter records, such as status on OpenApply's "Get all Students" endpoint. These parameters help narrow down results and optimize API calls.
GET https://api.openapply.com/api/v3/students?count=100&status=pending&page=2
The filters that are available in list endpoints are displayed in the technical documentation for the given endpoint.
Understanding Field Filters
OpenApply V3 returns a large dataset for each student record, but sometimes not all data points are required. If your integration only requires a subset of the fields, further optimization can be achieved by using the fields query parameter. This is discussed in a dedicated article on the topic.