Step-by-Step Guide to the LinkedIn API with Python – Extract LinkedIn Data Using Python

How do you call the LinkedIn API using Python? How do you extract LinkedIn campaign data using the LinkedIn API and Python?

Hi everyone, I hope you are keeping well. Today we are going to see how to write Python code that uses the LinkedIn API (LinkedIn Marketing API) to extract LinkedIn campaign and profile data. Applications that integrate with social networks are becoming increasingly popular for various purposes, and extracting data from social channels is a key part of them. Hold tight: you are about to learn how to use Python with the LinkedIn API.

Whatever the purpose, connecting to social media requires your application to go through an authentication and authorization process (mostly OAuth). For that, you need to go through the basic setup and generate a set of credentials. The credentials you are going to need are: Access Token, App ID, App Secret, and Account ID. If you have them, you are good to go. If you don’t have those credentials, don’t worry: you only need to go through 3 simple steps to set up a LinkedIn Developer account and generate them.

Code Structure:-

List of files in this project:-

  • ln_main.py
  • get_ln_campaign_data.py
  • ln_cred.json
  • campaign_category.json

Without boring you with a long setup intro, let’s look at how you can run Python code that connects to LinkedIn through the LinkedIn API (LinkedIn Marketing API) and pulls your profile data or analytics data (campaign and ad data).

1. Create a JSON file for Credentials

A JSON file is a convenient way to store the credentials required by your automation code. It keeps them out of the source code, so they can be updated or extended without touching the Python files.

Create a JSON file named “ln_cred.json”.

A sample JSON file structure is shown below; you can customize it according to your needs.

Note: I used client_id so that I can have the same JSON file for multiple clients and for easy scalability across clients.

{
    "access_token":"Replace it with LinkedIn access token",
    "client_id":"Replace it with LinkedIn app ID",
    "client_secret":"Replace it with LinkedIn app secret",
    "account_id":"Replace this with LinkedIn ads account ID"
}
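
Before using the file, it can help to check that the keys the scripts actually read are present. The snippet below is a minimal sketch of such a check, not part of the project files: the helper name `load_credentials` and the `REQUIRED_KEYS` tuple are assumptions made for illustration, and only `access_token` and `account_id` are checked because those are the two values read later in this guide.

```python
import json

# Keys that the scripts in this guide read from the credentials file
# (assumption: the other keys are optional for the data extraction itself)
REQUIRED_KEYS = ("access_token", "account_id")

def load_credentials(path="./ln_cred.json"):
    """Load the credentials JSON and fail early if a required key is missing or empty."""
    with open(path, "r") as cred_file:
        cred = json.load(cred_file)
    missing = [key for key in REQUIRED_KEYS if not cred.get(key)]
    if missing:
        raise KeyError("Missing keys in " + path + ": " + ", ".join(missing))
    return cred
```

Calling `load_credentials()` at the start of a run reports a missing key right away, instead of failing later in the middle of an API call.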

2. Create a Main Python file (The executable file – controller/initiator)

This Python file is used for flow control (i.e., it is the file you execute). Consider it the heart of this LinkedIn campaign data extraction project. Save this file as “ln_main.py”.

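#!/usr/local/bin/python3
# command to run this code $ python3 ./ln_main.py -s 2022-06-03 -e 2022-06-03
import getopt
import sys
import datetime
import os.path
import json

import pandas

from get_ln_campaign_data import get_LinkedIn_campaigns_list, get_LinkedIn_campaign

# start and end dates (YYYY-MM-DD), filled in from the command line by readfile()
s_date = None
e_date = None

def usage():
    # print the expected command line and stop
    print("Usage: python3 ./ln_main.py -s <start date YYYY-MM-DD> -e <end date YYYY-MM-DD>")
    sys.exit(2)

def isFile(fileName):
    # raise an error if the given path is not an existing file
    if not os.path.isfile(fileName):
        raise ValueError("You must provide a valid filename as parameter")

def readfile(argv):
    # parse the -s (start date) and -e (end date) command-line options
    global s_date
    global e_date
    try:
        opts, args = getopt.getopt(argv,"s:e:")
    except getopt.GetoptError:
        usage()
    for opt, arg in opts:
        if opt == '-s':
            s_date = arg
        elif opt == '-e':
            e_date = arg
    # both dates are required by the analytics query
    if s_date is None or e_date is None:
        usage()
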

if __name__ == '__main__':
    try:
        timestamp = datetime.datetime.strftime(datetime.datetime.now(),'%Y-%m-%d : %H:%M')
        print("DATE : ",timestamp,"\n")
        print("LinkedIn data extraction process Started")
        readfile(sys.argv[1:])

        #reading LinkedIn credential json file
        with open("./ln_cred.json", 'r') as cred_file:
            cred_json = json.load(cred_file)

        #reading campaign type json file
        with open("./campaign_category.json", 'r') as campaign_type_file:
            campaign_type_json = json.load(campaign_type_file)

        access_token = cred_json["access_token"]
        account_id = cred_json["account_id"]

        #optional tag for the analytics rows: "week"/"weekly" or "month"/"monthly"; None adds no tag
        qry_type = None

        #call the LinkedIn API query function (i.e get_LinkedIn_campaigns_list)
        ln_campaign_df = get_LinkedIn_campaigns_list(access_token,account_id,campaign_type_json)
        print("LinkedIn Data :\n",ln_campaign_df)

        if ln_campaign_df is not None and not ln_campaign_df.empty:
            #get campaign analytics data
            campaign_ids = ln_campaign_df["campaign_id"]
            campaign_analytics = get_LinkedIn_campaign(access_token,campaign_ids,s_date,e_date,qry_type)
            print("\n Campaigns analytics :\n",campaign_analytics)
        else:
            campaign_analytics = pandas.DataFrame()

        print("\n LinkedIn data extraction Process Finished \n")
    except Exception:
        print("\n LinkedIn data extraction processing Failed ", sys.exc_info())
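
To see what the option parsing in this file does in isolation, here is a small sketch using the same `getopt` option string (`"s:e:"`). The function name `parse_dates` is hypothetical and exists only for this illustration; it returns the two values instead of storing them in global variables.

```python
import getopt

def parse_dates(argv):
    """Return (start_date, end_date) taken from the -s and -e options."""
    # "s:e:" means that both -s and -e expect a value
    opts, _ = getopt.getopt(argv, "s:e:")
    values = dict(opts)
    return values.get("-s"), values.get("-e")

# Same arguments as in the run command shown at the top of ln_main.py
print(parse_dates(["-s", "2022-06-03", "-e", "2022-06-03"]))
# prints ('2022-06-03', '2022-06-03')
```

Returning the values makes the parsing easy to test on its own, while the main file above keeps the original approach of global variables.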

3. Creating a Data Extraction Process Python file

This Python file does the most important work: extracting data from LinkedIn campaigns and LinkedIn ads using the LinkedIn Marketing API. It acts as the brain of this project. Save the file with the name “get_ln_campaign_data.py”.

In the code below I fetch only a limited set of fields (metrics) from a LinkedIn campaign. You can add more fields from the LinkedIn metrics list to the code below to get more data.

#!/usr/bin/python3
import requests
import pandas
import sys
import json
import datetime
from datetime import timedelta
import re
 
#Function for date validation
def date_validation(date_text):
    try:
        #keep asking until the text round-trips through the YYYY-MM-DD format unchanged
        while date_text != datetime.datetime.strptime(date_text, '%Y-%m-%d').strftime('%Y-%m-%d'):
            date_text = input('Please enter the date in YYYY-MM-DD format\t')
        return datetime.datetime.strptime(date_text,'%Y-%m-%d')
    except Exception:
        raise Exception('linkedin_campaign_processing : date does not match format yyyy-mm-dd')
 
def get_LinkedIn_campaigns_list(access_token,account,campaign_type_json):
    #columns of the dataframe returned by this function
    columns = ["campaign_name","campaign_id","campaign_account","daily_budget",
               "unit_cost","objective_type","campaign_status","campaign_type"]
    try:
        url = "https://api.linkedin.com/v2/adCampaignsV2?q=search&search.account.values[0]=urn:li:sponsoredAccount:"+str(account)

        headers = {"Authorization": "Bearer "+access_token}
        #make the http call
        r = requests.get(url = url, headers = headers)
        #one dictionary per campaign; converted to a dataframe at the end
        #(DataFrame.append was removed in pandas 2.0)
        rows = []

        if r.status_code != 200:
            print("get_LinkedIn_campaigns_list function : something went wrong :",r)
        else:
            response_dict = json.loads(r.text)
            if "elements" in response_dict:
                campaigns = response_dict["elements"]
                print("\nTotal number of campaigns in account : ",len(campaigns))
                #loop over each campaign in the account
                for campaign in campaigns:
                    #check the status of each campaign; ignore DRAFT campaigns
                    if "status" in campaign and campaign["status"]!="DRAFT":
                        tmp_dict = {}
                        tmp_dict["campaign_name"] = campaign.get("name","NA")
                        tmp_dict["campaign_id"] = campaign.get("id","NA")

                        try:
                            #the account is a URN; keep only its numeric part
                            campaign_acct = re.findall(r'\d+',campaign["account"])[0]
                        except (KeyError,IndexError,TypeError):
                            campaign_acct = "NA"
                        tmp_dict["campaign_account"] = campaign_acct

                        #nested amounts; a missing budget or unit cost becomes None
                        tmp_dict["daily_budget"] = (campaign.get("dailyBudget") or {}).get("amount")
                        tmp_dict["unit_cost"] = (campaign.get("unitCost") or {}).get("amount")

                        #map the objective type to the off_site / on_site category
                        campaign_obj = campaign.get("objectiveType")
                        tmp_dict["campaign_type"] = None
                        if campaign_obj in campaign_type_json.get("off_site",[]):
                            tmp_dict["campaign_type"] = "off_site"
                        elif campaign_obj in campaign_type_json.get("on_site",[]):
                            tmp_dict["campaign_type"] = "on_site"
                        else:
                            print(" ### campaign objectiveType doesn't match CampaignType references ###")
                        tmp_dict["objective_type"] = campaign_obj
                        tmp_dict["campaign_status"] = campaign["status"]

                        rows.append(tmp_dict)
            else:
                print("\nkey *elements* missing in JSON data from LinkedIn")

        campaign_data_df = pandas.DataFrame(rows,columns=columns)
        #convert the amounts to numbers; values that cannot be parsed become NaN
        campaign_data_df["daily_budget"] = pandas.to_numeric(campaign_data_df["daily_budget"],errors="coerce")
        campaign_data_df["unit_cost"] = pandas.to_numeric(campaign_data_df["unit_cost"],errors="coerce")
        return campaign_data_df
    except Exception:
        print("get_LinkedIn_campaigns_list Failed :",sys.exc_info())
        #return an empty dataframe so that the caller can still test .empty
        return pandas.DataFrame(columns=columns)
 
def get_LinkedIn_campaign(access_token,campaigns_ids,s_date,e_date,qry_type=None):
    #columns of the dataframe returned by this function
    columns = ["campaign_id","start_date","end_date","cost_in_usd","impressions","clicks"]
    #optional extra column, depending on the query type
    if qry_type in ["week","weekly"]:
        columns.append("week")
    elif qry_type in ["month","monthly"]:
        columns.append("month")
    try:
        #calling date validation function for start_date format check
        startDate = date_validation(s_date)
        dt = startDate+timedelta(1)
        week_number = dt.isocalendar()[1]
        #calling date validation function for end_date format check
        endDate = date_validation(e_date)
        #one dictionary per analytics row; converted to a dataframe at the end
        rows = []

        #the date range and the authentication header are the same for every campaign
        dateRange_start = "dateRange.start.day="+str(startDate.day)+"&dateRange.start.month="+str(startDate.month)+"&dateRange.start.year="+str(startDate.year)
        dateRange_end = "dateRange.end.day="+str(endDate.day)+"&dateRange.end.month="+str(endDate.month)+"&dateRange.end.year="+str(endDate.year)
        headers = {"Authorization": "Bearer "+access_token}

        for cmp_id in campaigns_ids:
            #building the api query in the form of a url
            url = "https://api.linkedin.com/v2/adAnalyticsV2?q=analytics&pivot=CAMPAIGN&"+dateRange_start+"&"+dateRange_end+"&timeGranularity=ALL&campaigns[0]=urn:li:sponsoredCampaign:"+str(cmp_id)
            #make the http call
            r = requests.get(url = url, headers = headers)

            if r.status_code != 200:
                print("*get_LinkedIn_campaign : something went wrong :",r)
            else:
                response_dict = json.loads(r.text)
                if "elements" in response_dict:
                    for campaign in response_dict["elements"]:
                        #set the id and dates per row, so that rows of earlier campaigns are not overwritten
                        tmp_dict = {"campaign_id":cmp_id,"start_date":startDate,"end_date":endDate}
                        tmp_dict["cost_in_usd"] = campaign["costInUsd"]
                        tmp_dict["impressions"] = campaign["impressions"]
                        tmp_dict["clicks"] = campaign["clicks"]

                        if qry_type in ["week","weekly"]:
                            tmp_dict["week"] = week_number
                        elif qry_type in ["month","monthly"]:
                            tmp_dict["month"] = startDate.month

                        rows.append(tmp_dict)
                else:
                    print("\nkey *elements* missing in JSON data from LinkedIn")

        campaign_analytics_data = pandas.DataFrame(rows,columns=columns)
        campaign_analytics_data["cost_in_usd"] = pandas.to_numeric(campaign_analytics_data["cost_in_usd"])
        return campaign_analytics_data
    except Exception:
        print("\n*get_LinkedIn_campaign Failed :",sys.exc_info())
        #return an empty dataframe so that the caller always gets a dataframe back
        return pandas.DataFrame(columns=columns)
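
As mentioned before the code, more metrics can be pulled from each analytics element. One way to keep that change small is to drive the extraction from a list of metric names, so that adding a metric means adding one string. The snippet below is only a sketch of that idea: `METRICS`, `element_to_row` and the sample element are made up for illustration, and any extra names you add must match the LinkedIn metrics list exactly.

```python
# Metric names copied from each analytics element; extend this list to pull more fields.
# These three are the ones used in this guide.
METRICS = ["costInUsd", "impressions", "clicks"]

def element_to_row(element, campaign_id, metrics=METRICS):
    """Build one row (a dictionary) from one entry of the "elements" list."""
    row = {"campaign_id": campaign_id}
    for metric in metrics:
        # a metric that is absent from the response is stored as None
        row[metric] = element.get(metric)
    return row

# Hypothetical element, shaped like the entries read in get_LinkedIn_campaign
sample_element = {"costInUsd": "12.50", "impressions": 1000, "clicks": 25}
print(element_to_row(sample_element, 123456))
```

A list of such rows can be passed to `pandas.DataFrame(rows)` in one go, which is the same pattern the extraction file uses.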

4. Creating Campaign Category JSON file

This file lists the different LinkedIn campaign objective types, grouped into off-site and on-site campaigns. It is a JSON file, i.e., a key: value structure, with two keys: “off_site” and “on_site”.

Save this file as “campaign_category.json”.

{
    "off_site":["VIDEO_VIEW","LEAD_GENERATION","BRAND_AWARENESS","CREATIVE_ENGAGEMENT"],
    "on_site":["WEBSITE_VISIT","WEBSITE_CONVERSION","JOB_APPLICANT","ENGAGEMENT","WEBSITE_TRAFFIC"]
}
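
The lookup that the extraction code performs against this file can be sketched on its own as follows. The function name `campaign_type_for` is hypothetical; the JSON content is the same as in the file above.

```python
import json

# Same content as campaign_category.json above, inlined so that the sketch is self-contained
CATEGORY_JSON = """
{
    "off_site":["VIDEO_VIEW","LEAD_GENERATION","BRAND_AWARENESS","CREATIVE_ENGAGEMENT"],
    "on_site":["WEBSITE_VISIT","WEBSITE_CONVERSION","JOB_APPLICANT","ENGAGEMENT","WEBSITE_TRAFFIC"]
}
"""

def campaign_type_for(objective_type, categories):
    """Return the category key whose list contains the objective type, or None."""
    for category, objectives in categories.items():
        if objective_type in objectives:
            return category
    return None

categories = json.loads(CATEGORY_JSON)
print(campaign_type_for("LEAD_GENERATION", categories))  # prints off_site
print(campaign_type_for("WEBSITE_VISIT", categories))    # prints on_site
```

An objective type that appears in neither list returns `None`, which corresponds to the "doesn't match" message printed by the extraction code.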
Tips:-
  1. Store JSON files in a separate folder, say “config”, and Python files in a separate folder, say “source”, for better structure and easier maintenance.
  2. Don’t forget to renew the LinkedIn Marketing API access token: it expires after 2 months.
  3. To store the data in a database, create a separate file that handles only the data load process. Since we have used a pandas data frame, loading the data into a database table is easy. In some cases, a few additional steps may be needed to refine the data and its columns.
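
For the tip about the access token, a small reminder check can be sketched as below. This is an illustration under assumptions: the "2 months" is taken as 60 days, and the date on which the token was generated is something you would have to record yourself (it is not part of the credentials file shown earlier). The name `days_until_expiry` is hypothetical.

```python
from datetime import date, timedelta

# Assumption: "2 months" is treated as 60 days
TOKEN_LIFETIME_DAYS = 60

def days_until_expiry(token_created, today=None, lifetime_days=TOKEN_LIFETIME_DAYS):
    """Return how many days are left before the access token expires (negative if expired)."""
    if today is None:
        today = date.today()
    expires_on = token_created + timedelta(days=lifetime_days)
    return (expires_on - today).days

# Token generated on 2022-06-03 and checked on 2022-07-03
print(days_until_expiry(date(2022, 6, 3), today=date(2022, 7, 3)))  # prints 30
```

Printing a warning when the result drops below, say, 7 days gives you time to generate a new token before the extraction starts failing.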

Congratulations! You have successfully developed Python code to extract LinkedIn campaign and ad data. You can also look at the Python code to extract LinkedIn profile data (i.e., LinkedIn account details).

I hope I was able to solve your problem. If you liked this article and found it easy to understand, do share it with your friends and connections. Thank you! See you soon.

For any suggestions or doubts ~ Get In Touch

Check out my other API Integration and Coding Solution Guides