A comprehensive, production-ready API service that provides Instagram posts and Google Reviews on-demand for Shopify apps and websites. Built with intelligent caching, rate-limit protection, and zero-configuration scraping.
- 🌐 Live API: https://scraper.capula.co
- 📚 Documentation: https://scraper.capula.co/docs
- 🛍️ Shopify Guide: https://scraper.capula.co/docs/shopify
- ⭐ Google Reviews Guide: https://scraper.capula.co/docs/reviews
This API provides 3 main endpoints that work together:
// 1. Get Instagram Photos (no videos/reels)
fetch('https://scraper.capula.co/api/scrape?username=pascuccicoffee&count=7&type=photos')
// 2. Get Instagram Reels Only
fetch('https://scraper.capula.co/api/scrape?username=pascuccicoffee&count=5&type=reels')
// 3. Get Google Reviews
fetch('https://scraper.capula.co/api/reviews?organizationId=0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d&count=10')
That's it! No API keys needed on your end. Everything is handled server-side.
/api/scrape
Fetch Instagram posts (photos or reels) from any public Instagram username.
URL:
GET https://scraper.capula.co/api/scrape
Parameters:
| Parameter | Required | Type | Default | Values | Description |
|---|---|---|---|---|---|
| username | Yes | string | - | Any Instagram username | Username without @ symbol |
| count | No | integer | 7 | 1-50 | Number of posts to return |
| type | No | string | photos | photos or reels | Type of content to fetch |
Examples:
# Get 7 photos from @pascuccicoffee
https://scraper.capula.co/api/scrape?username=pascuccicoffee&count=7&type=photos
# Get 10 reels from @nike
https://scraper.capula.co/api/scrape?username=nike&count=10&type=reels
# Get 5 photos (type defaults to photos)
https://scraper.capula.co/api/scrape?username=starbucks&count=5
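The example URLs above can also be assembled programmatically. The following is a minimal sketch in Python using only the standard library; the helper name `build_scrape_url` is hypothetical and not part of the service.

```python
from urllib.parse import urlencode

SCRAPE_ENDPOINT = "https://scraper.capula.co/api/scrape"


def build_scrape_url(username, count=7, media_type="photos"):
    """Return a /api/scrape URL with properly encoded query parameters."""
    params = {
        # The API expects the username without the leading "@".
        "username": username.lstrip("@"),
        "count": count,
        "type": media_type,
    }
    return f"{SCRAPE_ENDPOINT}?{urlencode(params)}"


print(build_scrape_url("pascuccicoffee", 7, "photos"))
print(build_scrape_url("@nike", 10, "reels"))
```

Using `urlencode` keeps the query string valid even if a value contains characters that need escaping.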
Response Format (Photos):
{
"data": [
{
"media_url": "https://scraper.capula.co/media/pascuccicoffee_123456.webp",
"permalink": "https://www.instagram.com/p/ABC123/",
"timestamp": "2025-10-08T09:30:00Z",
"caption": "Fresh coffee this morning! #coffee",
"media_type": "photo"
}
],
"count": 7,
"username": "pascuccicoffee",
"type": "photos",
"cached": true,
"scraped_at": "2025-10-09T01:24:27Z",
"cache_expires_in_hours": 18.5
}
Response Format (Reels):
{
"data": [
{
"media_url": "https://scraper.capula.co/media/pascuccicoffee_123456.mp4",
"thumbnail_url": "https://scraper.capula.co/media/pascuccicoffee_123456_thumb.webp",
"permalink": "https://www.instagram.com/p/ABC123/",
"timestamp": "2025-10-08T09:30:00Z",
"caption": "Fresh coffee reel! #coffee",
"media_type": "reel"
}
],
"count": 7,
"username": "pascuccicoffee",
"type": "reels",
"cached": true,
"scraped_at": "2025-10-09T01:24:27Z",
"cache_expires_in_hours": 18.5
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| data | array | Array of Instagram posts |
| data[].media_url | string | Direct URL to media file (WebP for photos, MP4 for reels) |
| data[].thumbnail_url | string | [Reels only] Direct URL to thumbnail/cover image (WebP, optimized) |
| data[].permalink | string | Link to original Instagram post |
| data[].timestamp | string | Publication date (ISO 8601) |
| data[].caption | string | Post caption (truncated to 200 chars) |
| data[].media_type | string | "photo" or "reel" |
| count | integer | Number of posts returned |
| username | string | Instagram username |
| type | string | Content type ("photos" or "reels") |
| cached | boolean | true = from cache (no API quota used), false = fresh scrape (1 API request) |
| scraped_at | string | When data was originally fetched |
| cache_expires_in_hours | float | Hours until cache expires and data refreshes |
Cache Duration: 24 hours per username + type combination
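To illustrate how `scraped_at` and `cache_expires_in_hours` relate, here is a small sketch that derives the remaining cache lifetime from the scrape timestamp. It assumes the cache simply expires 24 hours after `scraped_at`; the server's actual calculation is not documented here, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp):
    """Parse an ISO 8601 timestamp such as '2025-10-09T01:24:27Z'."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def hours_until_expiry(scraped_at, now, ttl_hours=24):
    """Hours left before an entry scraped at `scraped_at` expires (never negative)."""
    expires_at = parse_utc(scraped_at) + timedelta(hours=ttl_hours)
    remaining = (expires_at - now).total_seconds() / 3600
    return max(0.0, round(remaining, 1))


# 5.5 hours after the scrape in the sample response above
now = datetime(2025, 10, 9, 6, 54, 27, tzinfo=timezone.utc)
print(hours_until_expiry("2025-10-09T01:24:27Z", now))  # 18.5
```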
/api/reviews
Fetch Google Reviews for any business using its Google Maps organization ID.
URL:
GET https://scraper.capula.co/api/reviews
Parameters:
| Parameter | Required | Type | Default | Description |
|---|---|---|---|---|
| organizationId | Yes | string | - | Google Maps organization ID (e.g., 0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d) |
| count | No | integer | 10 | Number of reviews to return (1-50) |
Step-by-Step:
1. Open the business on Google Maps and copy the URL from the address bar. Example URL:
https://www.google.com/maps/place/Empire+State+Plumbing+Heating+%26+Air+Conditioning/@42.9354131,-73.8094899,17z/data=!3m1!4b1!4m6!3m5!1s0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d!8m2!3d42.9354131!4d-73.806715!16s%2Fg%2F11b6g2tcd1
2. Find the segment that starts with 1s followed by the ID:
1s0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3. Remove the 1s prefix:
0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d ← This is your organizationId
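The manual steps above can be automated with a regular expression. This is a sketch, not part of the service: the function name `extract_organization_id` is hypothetical, and it assumes the URL contains the `!1s0x…:0x…` segment with the `!` characters unencoded, as in the example URL.

```python
import re

# Matches the "!1s0x<hex>:0x<hex>" segment of a Google Maps place URL
# and captures the part after the "1s" prefix.
ORG_ID_PATTERN = re.compile(r"!1s(0x[0-9a-fA-F]+:0x[0-9a-fA-F]+)")


def extract_organization_id(maps_url):
    """Return the organizationId found in a Google Maps URL, or None."""
    match = ORG_ID_PATTERN.search(maps_url)
    return match.group(1) if match else None


url = (
    "https://www.google.com/maps/place/Empire+State+Plumbing+Heating+%26+Air+Conditioning/"
    "@42.9354131,-73.8094899,17z/data=!3m1!4b1!4m6!3m5"
    "!1s0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d"
    "!8m2!3d42.9354131!4d-73.806715!16s%2Fg%2F11b6g2tcd1"
)
print(extract_organization_id(url))  # 0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d
```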
Examples:
# Get 10 reviews for Empire State Plumbing
https://scraper.capula.co/api/reviews?organizationId=0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d&count=10
# Get 20 reviews
https://scraper.capula.co/api/reviews?organizationId=0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d&count=20
# Get maximum 50 reviews
https://scraper.capula.co/api/reviews?organizationId=0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d&count=50
Response Format:
{
"data": [
{
"rating": 5,
"comment": "Excellent service! Very professional and knowledgeable staff...",
"date": "2025-06-16T18:27:39.118Z",
"author": null,
"photos": [
"https://lh3.googleusercontent.com/..."
],
"owner_response": "Thank you so much for your feedback! We're thrilled..."
}
],
"count": 10,
"organizationId": "0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d",
"cached": true,
"scraped_at": "2025-10-09T01:13:45Z",
"cache_expires_in_days": 6.9
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| data | array | Array of Google reviews |
| data[].rating | integer | Star rating (1-5) |
| data[].comment | string | Review text/comment (full text) |
| data[].date | string | Publication date (ISO 8601) |
| data[].author | string/null | Reviewer name (may be null for privacy) |
| data[].photos | array | URLs to review photos (empty if none) |
| data[].owner_response | string/null | Business owner's response (null if no response) |
| count | integer | Number of reviews returned |
| organizationId | string | The business organization ID |
| cached | boolean | true = from cache, false = fresh scrape |
| scraped_at | string | When data was originally fetched |
| cache_expires_in_days | float | Days until cache expires |
Cache Duration: 7 days per organizationId
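As an example of consuming these fields, the sketch below summarizes a reviews payload. The `summarize_reviews` helper and the sample payload are made up for illustration; only the field names come from the table above.

```python
def summarize_reviews(payload):
    """Compute simple statistics from a /api/reviews response body."""
    reviews = payload.get("data", [])
    ratings = [r["rating"] for r in reviews if r.get("rating") is not None]
    return {
        "count": len(reviews),
        "average_rating": round(sum(ratings) / len(ratings), 2) if ratings else None,
        # owner_response is null (None) when the business did not reply
        "with_owner_response": sum(1 for r in reviews if r.get("owner_response")),
        # photos is an empty list when the review has no photos
        "with_photos": sum(1 for r in reviews if r.get("photos")),
    }


# Illustrative payload, not real review data
sample = {
    "data": [
        {"rating": 5, "comment": "Great", "author": None,
         "photos": ["https://example.com/a.jpg"], "owner_response": "Thank you!"},
        {"rating": 4, "comment": "Good", "author": None,
         "photos": [], "owner_response": None},
    ]
}
print(summarize_reviews(sample))
```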
| Endpoint | Method | Purpose | Cache | Parameters |
|---|---|---|---|---|
| /api/scrape | GET | Instagram photos/reels | 24h | username, count, type |
| /api/reviews | GET | Google Reviews | 7d | organizationId, count |
| /api/search-business | GET | Find organizationId helper | - | query |
| /health | GET | Health check | - | none |

| Endpoint | Method | Purpose |
|---|---|---|
| / | GET | Main documentation (HTML) |
| /docs | GET | Main documentation (HTML) |
| /docs/shopify | GET | Shopify integration guide (HTML) |
| /docs/reviews | GET | Google Reviews guide (HTML) |

| Endpoint | Method | Purpose | Cache |
|---|---|---|---|
| /media/{filename}.webp | GET | Serve optimized images | 12h |
| /feeds | GET | List all cached feeds | - |
No authentication required for API consumers (Shopify apps, websites).
All authentication is handled server-side with RapidAPI. The API uses:
- Instagram120 API (RapidAPI)
- Google Maps API Unofficial (RapidAPI)
- Single RapidAPI key configured on server
For developers/admins:
- Server-side RapidAPI key stored in .env file
- Same key works for both Instagram and Google Reviews APIs
Why 24 hours?
- Instagram users post frequently (daily/multiple times per day)
- Fresh content is important for engagement
- Keeps data current without excessive API calls
How it works:
1. First Request: Fetch from Instagram API → Save to cache → Return data
2. Subsequent Requests (< 24h): Serve from cache (instant, no API quota)
3. After 24 Hours: Cache expires → Next request fetches fresh data
Cache Key: username + type (photos or reels)
Example:
- Request @nike photos at 10 AM → Fresh scrape (1 API request)
- Request @nike photos at 2 PM → Cached (0 API requests)
- Request @nike photos at 10 AM next day → Fresh scrape (1 API request)
- Request @nike reels at 11 AM → Fresh scrape (different type, 1 API request)
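The caching behaviour in this example can be sketched as a small time-to-live cache. This is an in-memory illustration only: the real service stores JSON files on disk, and the `TTLCache` class and its injectable clock are assumptions made so the timeline can be simulated.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        """Return (value, cached). fetch() runs only on a miss or after expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1], True
        value = fetch()
        self._entries[key] = (now, value)
        return value, False


HOUR = 3600
fake_now = [10 * HOUR]  # simulated clock, starting at 10 AM
cache = TTLCache(24 * HOUR, clock=lambda: fake_now[0])
upstream_calls = []


def fetch(key):
    upstream_calls.append(key)  # stands in for one upstream API request
    return [f"post for {key}"]


timeline = [(10, "photos"), (14, "photos"), (34, "photos"), (35, "reels")]
results = []
for hour, media_type in timeline:
    fake_now[0] = hour * HOUR
    key = ("nike", media_type)  # cache key: username + type
    _, cached = cache.get_or_fetch(key, lambda: fetch(key))
    results.append(cached)
    print(f"hour {hour:>2} {media_type}: cached={cached}")
```

The key is the `(username, type)` pair, so photos and reels for the same account are cached independently.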
Why 7 days?
- Reviews don't change frequently (businesses get reviews weekly, not daily)
- Longer cache = better API quota management
- Review data is less time-sensitive than social media
How it works:
1. First Request: Fetch from Google API → Save to cache → Return data
2. Subsequent Requests (< 7 days): Serve from cache (instant, no API quota)
3. After 7 Days: Cache expires → Next request fetches fresh data
Cache Key: organizationId
Example:
- Request business reviews Monday → Fresh scrape (1 API request)
- Request same business Tuesday-Sunday → Cached (0 API requests)
- Request same business next Monday, before the 7-day mark → Cached (0 API requests)
- Request same business 8 days later → Fresh scrape (1 API request)
All cached data stored in: /app/data/
Instagram:
- ig_user_{username}.json (all posts)
- ig_user_{username}_photos.json (photos only)
- ig_user_{username}_reels.json (reels only)
Google Reviews:
- google_reviews_{safe_org_id}.json
Images:
- media/{username}_{post_id}.webp
Instagram120 API:
- 35 requests per day (free tier)
- Each unique username + type = 1 request per 24 hours
- Cached requests = 0 API quota
Google Maps API (Unofficial):
- Varies by plan (check RapidAPI dashboard)
- Each unique organizationId = 1 request per 7 days
- Cached requests = 0 API quota
With caching, you can serve:
- ✅ Unlimited requests from Shopify/websites
- ✅ Only 1 API call per username + type per day (Instagram)
- ✅ Only 1 API call per business per week (Google Reviews)
Example Scenario:
- 10 Shopify stores use the API
- Each store requests 3 Instagram users + 1 Google Review business
- Each store makes 100 requests per day
API Usage:
- Instagram: 30 unique usernames = 30 API requests per day (within 35 limit ✅)
- Google: 10 unique businesses = 10 API requests (for entire week ✅)
- All other requests (roughly 960 of the 1,000 daily requests) = 0 API requests (cached ✅)
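The arithmetic behind this scenario can be checked with a short calculation. The sketch assumes every username and business is unique across stores and that each Instagram user is requested with a single content type; the function name `daily_quota` is hypothetical.

```python
INSTAGRAM_DAILY_LIMIT = 35  # free-tier limit quoted above


def daily_quota(stores, usernames_per_store, businesses_per_store, requests_per_store):
    """Estimate upstream API calls on the first day, when all caches are cold."""
    total_requests = stores * requests_per_store
    instagram_calls = stores * usernames_per_store  # one per username per 24h
    google_calls = stores * businesses_per_store    # one per business per 7 days
    return {
        "total_requests": total_requests,
        "instagram_calls": instagram_calls,
        "google_calls": google_calls,
        "served_from_cache": total_requests - instagram_calls - google_calls,
        "within_instagram_limit": instagram_calls <= INSTAGRAM_DAILY_LIMIT,
    }


usage = daily_quota(stores=10, usernames_per_store=3,
                    businesses_per_store=1, requests_per_store=100)
print(usage)
```

On later days within the same week the Google calls drop to zero, so slightly more requests are served from cache.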
Instagram:
{
"error": "Missing parameter",
"message": "The \"username\" parameter is required",
"example": "/api/scrape?username=pascuccicoffee&count=7&type=photos"
}
Google Reviews:
{
"error": "Missing parameter",
"message": "The \"organizationId\" parameter is required",
"example": "/api/reviews?organizationId=0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d&count=10",
"help": "Find organizationId from Google Maps URL or use Google Maps Place ID"
}
{
"error": "Invalid type",
"message": "Type must be \"photos\" or \"reels\"",
"example": "/api/scrape?username=pascuccicoffee&count=7&type=photos"
}
{
"error": "Invalid count",
"message": "Count must be between 1 and 50",
"provided": 100
}
{
"error": "No posts found",
"message": "Could not fetch posts for @invaliduser123. Username may not exist or may be private.",
"username": "invaliduser123"
}
{
"error": "No reviews found",
"message": "Could not fetch reviews for organization 0xinvalidid",
"organizationId": "0xinvalidid"
}
{
"error": "Internal server error",
"message": "Detailed error message here..."
}
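Several of the errors above can be caught before a request is sent. The sketch below mirrors the documented error payloads on the client side; `validate_scrape_params` is a hypothetical helper, and the server remains the authority on what is valid.

```python
def validate_scrape_params(username, count=7, media_type="photos"):
    """Return an error dict shaped like the API's responses, or None if valid."""
    if not username:
        return {
            "error": "Missing parameter",
            "message": 'The "username" parameter is required',
        }
    if media_type not in ("photos", "reels"):
        return {
            "error": "Invalid type",
            "message": 'Type must be "photos" or "reels"',
        }
    if not isinstance(count, int) or not 1 <= count <= 50:
        return {
            "error": "Invalid count",
            "message": "Count must be between 1 and 50",
            "provided": count,
        }
    return None


print(validate_scrape_params("pascuccicoffee", 7, "photos"))  # None
print(validate_scrape_params("pascuccicoffee", 100))
```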
async function fetchInstagramPosts(username, count = 7, type = 'photos') {
  try {
    // URLSearchParams encodes each value so it cannot break the query string
    const params = new URLSearchParams({ username, count, type });
    const response = await fetch(`https://scraper.capula.co/api/scrape?${params}`);
    const data = await response.json();

    // Check for errors
    if (!response.ok) {
      console.error(`Error ${response.status}:`, data.error);
      console.error('Message:', data.message);

      // Handle specific error cases
      switch (response.status) {
        case 400:
          // Bad request - show user-friendly message
          alert(`Invalid request: ${data.message}`);
          break;
        case 404:
          // Not found - show fallback content
          console.log('Username not found, showing fallback');
          break;
        case 500:
          // Server error - retry later (the retry's result is not returned to this caller)
          console.log('Server error, will retry in 5 minutes');
          setTimeout(() => fetchInstagramPosts(username, count, type), 300000);
          break;
      }
      return null;
    }

    // Success - use the data
    console.log(`Got ${data.count} ${type} from @${username}`);
    console.log(`Cached: ${data.cached}`);
    return data.data;
  } catch (error) {
    // Covers network failures and responses that are not valid JSON
    console.error('Network error:', error);
    return null;
  }
}
async function getInstagramPhotos(username, count = 7) {
  // encodeURIComponent keeps unusual characters from breaking the query string
  const url = `https://scraper.capula.co/api/scrape?username=${encodeURIComponent(username)}&count=${count}&type=photos`;
  const response = await fetch(url);
  const data = await response.json();
  if (response.ok) {
    data.data.forEach(post => {
      console.log(`Image: ${post.media_url}`);
      console.log(`Caption: ${post.caption}`);
      console.log(`Link: ${post.permalink}`);
    });
    return data.data;
  }
  return null;
}

// Usage
getInstagramPhotos('pascuccicoffee', 7);

async function getInstagramReels(username, count = 5) {
  const url = `https://scraper.capula.co/api/scrape?username=${encodeURIComponent(username)}&count=${count}&type=reels`;
  const response = await fetch(url);
  const data = await response.json();
  if (response.ok) {
    data.data.forEach(reel => {
      console.log(`Reel: ${reel.media_url}`);
      console.log(`Caption: ${reel.caption}`);
    });
    return data.data;
  }
  return null;
}

// Usage
getInstagramReels('pascuccicoffee', 5);

async function getGoogleReviews(organizationId, count = 10) {
  // The ID contains a colon, which encodeURIComponent turns into %3A
  const url = `https://scraper.capula.co/api/reviews?organizationId=${encodeURIComponent(organizationId)}&count=${count}`;
  const response = await fetch(url);
  const data = await response.json();
  if (response.ok) {
    data.data.forEach(review => {
      console.log(`⭐ ${review.rating}/5`);
      console.log(`💬 ${review.comment}`);
      if (review.owner_response) {
        console.log(`💼 Owner: ${review.owner_response}`);
      }
    });
    return data.data;
  }
  return null;
}

// Usage
getGoogleReviews('0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d', 10);
import requests


def get_instagram_photos(username, count=7):
    url = 'https://scraper.capula.co/api/scrape'
    params = {
        'username': username,
        'count': count,
        'type': 'photos'
    }
    # A fresh scrape can take 5-10 seconds, so allow a generous timeout
    response = requests.get(url, params=params, timeout=30)
    if response.status_code == 200:
        data = response.json()
        for post in data['data']:
            print(f"Image: {post['media_url']}")
            print(f"Caption: {post['caption']}")
        return data['data']
    return None


def get_google_reviews(organization_id, count=10):
    url = 'https://scraper.capula.co/api/reviews'
    params = {
        'organizationId': organization_id,
        'count': count
    }
    response = requests.get(url, params=params, timeout=30)
    if response.status_code == 200:
        data = response.json()
        for review in data['data']:
            print(f"⭐ {review['rating']}/5")
            print(f"💬 {review['comment']}")
        return data['data']
    return None


# Usage
get_instagram_photos('pascuccicoffee', 7)
get_google_reviews('0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d', 10)
<?php
function getInstagramPhotos($username, $count = 7) {
    $url = 'https://scraper.capula.co/api/scrape?' . http_build_query([
        'username' => $username,
        'count' => $count,
        'type' => 'photos'
    ]);
    $response = file_get_contents($url);
    if ($response === false) {
        return null; // Request failed (network error or HTTP error status)
    }
    $data = json_decode($response, true);
    $posts = $data['data'] ?? [];
    foreach ($posts as $post) {
        echo "Image: {$post['media_url']}\n";
        echo "Caption: {$post['caption']}\n";
    }
    return $posts;
}

function getGoogleReviews($organizationId, $count = 10) {
    $url = 'https://scraper.capula.co/api/reviews?' . http_build_query([
        'organizationId' => $organizationId,
        'count' => $count
    ]);
    $response = file_get_contents($url);
    if ($response === false) {
        return null; // Request failed (network error or HTTP error status)
    }
    $data = json_decode($response, true);
    $reviews = $data['data'] ?? [];
    foreach ($reviews as $review) {
        echo "⭐ {$review['rating']}/5\n";
        echo "💬 {$review['comment']}\n";
    }
    return $reviews;
}

// Usage
getInstagramPhotos('pascuccicoffee', 7);
getGoogleReviews('0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d', 10);
?>
# Instagram Photos
curl "https://scraper.capula.co/api/scrape?username=pascuccicoffee&count=7&type=photos"
# Instagram Reels
curl "https://scraper.capula.co/api/scrape?username=pascuccicoffee&count=5&type=reels"
# Google Reviews
curl "https://scraper.capula.co/api/reviews?organizationId=0x89de0b0b3cdbe1d3%3A0x27519164cd8d3b5d&count=10"
# With jq for pretty JSON
curl -s "https://scraper.capula.co/api/reviews?organizationId=0x89de0b0b3cdbe1d3%3A0x27519164cd8d3b5d&count=10" | jq
Container: instagram_scraper
Port: 5050
Domain: https://scraper.capula.co
Reverse Proxy: Nginx Proxy Manager
Networks: instagram_scraper_default, my_shared_proxy_network
Restart Policy: always (auto-restart on crash)
version: '3.8'
services:
  instagram-scraper:
    build: .
    container_name: instagram_scraper
    ports:
      - "5050:5050"
    networks:
      - default
      - my_shared_proxy_network
    environment:
      - RAPIDAPI_KEY=${RAPIDAPI_KEY}
      - INSTAGRAM_USERNAME_TARGET=${INSTAGRAM_USERNAME_TARGET:-pascuccicoffee}
      - POST_COUNT=${POST_COUNT:-7}
      - BASE_URL=${BASE_URL:-https://scraper.capula.co}
    volumes:
      - ./data:/app/data
      - ./logs:/app/logs
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5050/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
networks:
  my_shared_proxy_network:
    external: true
Required:
- RAPIDAPI_KEY - Your RapidAPI key (works for both APIs)
Optional:
- INSTAGRAM_USERNAME_TARGET - Default Instagram username (for cron)
- POST_COUNT - Default number of posts (for cron)
- BASE_URL - Your domain for absolute media URLs
Proxy Host Configuration:
Domain Names: scraper.capula.co
Scheme: http
Forward Hostname/IP: instagram_scraper
Forward Port: 5050
Block Common Exploits: Yes
Websockets Support: No
SSL:
- Force SSL: Yes
- HTTP/2 Support: Yes
- HSTS Enabled: Yes
- SSL Certificate: Let's Encrypt
# Start services
docker-compose up -d
# View logs
docker-compose logs -f
# Restart
docker-compose restart
# Rebuild (after code changes)
docker-compose down && docker-compose up -d --build
# Check health
curl https://scraper.capula.co/health
# Manual scrape (Instagram)
docker exec instagram_scraper python scraper.py
# View cached data
ls -lh /home/docker/instagram_scraper/data/
"Username not found" error:
- Verify username exists on Instagram
- Check if account is public (private accounts don't work)
- Try the username directly: https://instagram.com/{username}
No posts returned:
- Account may have no posts
- All posts might be videos (if requesting type=photos)
- Check response: "count": 0 indicates no matching content
Slow response times:
- First request: 5-10 seconds (fresh scrape from Instagram)
- Cached requests: <100ms (instant)
- "cached": false in the response means the data was fetched fresh rather than served from cache
"organizationId not found" error:
- Verify the organization ID format: 0xHEX:0xHEX
- Check Google Maps URL contains the ID
- Business must have Google My Business profile
No reviews returned:
- Business may have no reviews
- Reviews may be disabled
- Check if business exists: Google Maps search
Wrong reviews shown:
- Verify organizationId is correct
- Check you copied the full ID (including both hex parts)
- Ensure URL encoding if special characters
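Both the format check and the URL encoding mentioned above can be done in a few lines. This is a sketch with a hypothetical helper name, `encode_organization_id`; it assumes the ID always has the `0xHEX:0xHEX` shape described in this guide.

```python
import re
from urllib.parse import quote

# Two hexadecimal parts, each prefixed with 0x and separated by a colon
ORG_ID_FORMAT = re.compile(r"0x[0-9a-fA-F]+:0x[0-9a-fA-F]+")


def encode_organization_id(org_id):
    """Validate the 0xHEX:0xHEX format and return the URL-encoded ID."""
    if not ORG_ID_FORMAT.fullmatch(org_id):
        raise ValueError(f"organizationId must look like 0xHEX:0xHEX, got {org_id!r}")
    # safe="" makes quote() escape the colon as %3A
    return quote(org_id, safe="")


print(encode_organization_id("0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d"))
# 0x89de0b0b3cdbe1d3%3A0x27519164cd8d3b5d
```

A truncated ID such as `0x89de0b0b3cdbe1d3` (only one hex part) raises `ValueError` instead of silently returning the wrong reviews.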
502 Bad Gateway:
- Container not running: docker ps | grep instagram
- Not on proxy network: docker network connect my_shared_proxy_network instagram_scraper
- Check Nginx Proxy Manager settings
503 Service Unavailable:
- Container unhealthy: docker logs instagram_scraper
- Restart: docker-compose restart
SSL Certificate Error:
- Let's Encrypt not configured
- Force SSL not enabled
- Check Nginx Proxy Manager SSL settings
curl https://scraper.capula.co/health
Response:
{
"status": "healthy",
"timestamp": "2025-10-09T01:24:27Z"
}
Instagram (check cached field):
curl "https://scraper.capula.co/api/scrape?username=pascuccicoffee&count=1&type=photos" | jq '{cached, cache_expires_in_hours}'
Google Reviews (check cached field):
curl "https://scraper.capula.co/api/reviews?organizationId=0x89de0b0b3cdbe1d3:0x27519164cd8d3b5d&count=1" | jq '{cached, cache_expires_in_days}'
curl https://scraper.capula.co/feeds | jq
Detailed Guides:
- Shopify Integration Guide - Complete Shopify app integration
- Google Reviews Guide - Detailed Google Reviews documentation
- Local Files - Markdown documentation files
API Reference:
- Health: GET /health
- Instagram: GET /api/scrape
- Google Reviews: GET /api/reviews
- Business Search Helper: GET /api/search-business
Response Times:
| Scenario | Instagram | Google Reviews |
|---|---|---|
| Cached (hit) | <100ms | <100ms |
| Fresh scrape | 5-10 sec | 5-10 sec |
| Not found | 3-5 sec | 3-5 sec |
Image Optimization:
- Format: WebP (modern, compressed)
- Max dimension: 1200px
- Quality: 85%
- Typical size: 50-200KB (vs 1-5MB original)
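To make the 1200px limit concrete, the sketch below computes target dimensions for a downscaled image. It assumes the aspect ratio is preserved and that smaller images are never upscaled; the service's actual resizing logic is not shown in this document, and `fit_within` is a hypothetical name.

```python
def fit_within(width, height, max_dimension=1200):
    """Scale (width, height) down so the longer side is at most max_dimension."""
    longest = max(width, height)
    if longest <= max_dimension:
        return width, height  # already small enough, do not upscale
    new_width = max(1, round(width * max_dimension / longest))
    new_height = max(1, round(height * max_dimension / longest))
    return new_width, new_height


print(fit_within(4000, 3000))  # (1200, 900)
print(fit_within(1080, 1350))  # (960, 1200)
print(fit_within(800, 600))    # (800, 600)
```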
Planned features:
- [ ] Twitter/X scraper
- [ ] TikTok scraper
- [ ] LinkedIn posts
- [ ] Yelp reviews
- [ ] Facebook reviews
- [ ] Webhook notifications
- [ ] Admin dashboard
- [ ] Analytics/usage stats

- Live Service: https://scraper.capula.co
- Health Check: https://scraper.capula.co/health
- Documentation: https://scraper.capula.co/docs
For Issues:
1. Check container logs: docker logs instagram_scraper
2. Verify API status: /health endpoint
3. Review RapidAPI dashboard for quota
4. Check network connectivity
MIT
Three simple endpoints. Zero configuration. Unlimited possibilities.
// Instagram Photos
fetch('https://scraper.capula.co/api/scrape?username=nike&count=10&type=photos')
// Instagram Reels
fetch('https://scraper.capula.co/api/scrape?username=nike&count=10&type=reels')
// Google Reviews
fetch('https://scraper.capula.co/api/reviews?organizationId=YOUR_ID&count=20')
Everything you need:
- ✅ Automatic caching (24h Instagram, 7d Google Reviews)
- ✅ Rate limit protection
- ✅ Error handling
- ✅ Optimized images
- ✅ CORS enabled
- ✅ SSL/HTTPS
- ✅ Auto-restart on crash
- ✅ Complete documentation
Start building today! 🚀