Salesforce

Module salesforce

Incubating

Important Capabilities

| Capability | Status | Notes |
| --- | --- | --- |
| Data Profiling | | Only table level profiling is supported via profiling.enabled config field |
| Detect Deleted Entities | | Not supported yet |
| Domains | | Supported via the domain config field |
| Platform Instance | | Can be equivalent to Salesforce organization |

Prerequisites

In order to ingest metadata from Salesforce, you will need:

  • Salesforce username, password, and security token, OR
  • Salesforce instance URL and access token/session ID (suitable for one-shot ingestion only, as the access token typically expires after 2 hours of inactivity)
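
The two credential options above can be sketched as the keyword arguments one might pass to a Salesforce client. This is a minimal illustration, not the connector's actual code: the helper name `build_login_kwargs` is hypothetical, and the keyword names (`username`, `password`, `security_token`, `instance_url`, `session_id`) are assumed to match the simple-salesforce `Salesforce` constructor.

```python
from typing import Dict, Optional


def build_login_kwargs(
    username: Optional[str] = None,
    password: Optional[str] = None,
    security_token: Optional[str] = None,
    instance_url: Optional[str] = None,
    access_token: Optional[str] = None,
) -> Dict[str, str]:
    """Pick one of the two credential sets (hypothetical helper)."""
    if username and password and security_token:
        # Option 1: username, password and security token.
        return {
            "username": username,
            "password": password,
            "security_token": security_token,
        }
    if instance_url and access_token:
        # Option 2: instance URL plus a short-lived access token/session ID,
        # which suits one-shot ingestion only.
        return {"instance_url": instance_url, "session_id": access_token}
    raise ValueError(
        "Provide username/password/security_token, or instance_url/access_token"
    )


kwargs = build_login_kwargs(
    instance_url="https://mydomain.my.salesforce.com/",
    access_token="example-session-id",  # placeholder value
)
print(sorted(kwargs))
```

Failing fast when neither set is complete avoids a confusing authentication error later in the run.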

The account used to access Salesforce requires the following permissions for this integration to work:

  • View Setup and Configuration
  • View All Data

Integration Details

This plugin extracts Salesforce Standard and Custom Objects and their details (fields, record count, etc.) from a Salesforce instance. The Python library simple-salesforce is used to authenticate and to call the Salesforce REST API to retrieve details from the Salesforce instance.
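
To give a feel for the kind of REST calls involved, the sketch below builds the URLs of two standard Salesforce REST resources, Describe Global and sObject Describe. It is an assumption that these are among the resources the connector calls; the API version `v54.0` and the helper names are hypothetical placeholders.

```python
from urllib.parse import urljoin

API_VERSION = "v54.0"  # assumption: substitute a version your org supports


def describe_global_url(instance_url: str, version: str = API_VERSION) -> str:
    """URL of the Describe Global resource, which lists available objects."""
    return urljoin(instance_url, f"/services/data/{version}/sobjects/")


def describe_object_url(
    instance_url: str, object_name: str, version: str = API_VERSION
) -> str:
    """URL of the sObject Describe resource for one object (fields, etc.)."""
    return urljoin(
        instance_url, f"/services/data/{version}/sobjects/{object_name}/describe/"
    )


base = "https://mydomain.my.salesforce.com/"
print(describe_global_url(base))
print(describe_object_url(base, "Account"))
```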

REST API Resources used in this integration

Concept Mapping

This ingestion source maps the following Source System Concepts to DataHub Concepts:

| Source Concept | DataHub Concept | Notes |
| --- | --- | --- |
| Salesforce | Data Platform | |
| Standard Object | Dataset | subtype "Standard Object" |
| Custom Object | Dataset | subtype "Custom Object" |
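
The mapping above can be illustrated with a small sketch. It assumes each object description carries a boolean `custom` flag, as Salesforce describe results do; the function name `dataset_subtype` and the sample entries are hypothetical.

```python
from typing import Any, Dict


def dataset_subtype(sobject: Dict[str, Any]) -> str:
    """Return the DataHub dataset subtype for one Salesforce object entry."""
    return "Custom Object" if sobject.get("custom") else "Standard Object"


# Illustrative entries only, not real API output.
objects = [
    {"name": "Account", "custom": False},
    {"name": "Invoice__c", "custom": True},
]
for obj in objects:
    print(obj["name"], "->", dataset_subtype(obj))
```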

Caveats

  • This connector has only been tested with Salesforce Developer Edition.
  • This connector currently supports only table-level profiling (row and column counts). Row counts are approximate, as returned by the Salesforce RecordCount REST API.
  • This integration does not support ingesting Salesforce External Objects.
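
As a rough illustration of where the approximate row counts come from, the sketch below parses a RecordCount-style response into a name-to-count mapping. The response shape (`sObjects` entries with `name` and `count`) is assumed from the Salesforce REST API; the sample values and the function name `parse_record_counts` are made up for the example.

```python
import json
from typing import Dict

# Illustrative payload; the counts are invented for this example.
sample_response = (
    '{"sObjects": [{"count": 3, "name": "Account"},'
    ' {"count": 10, "name": "Contact"}]}'
)


def parse_record_counts(body: str) -> Dict[str, int]:
    """Map each object name to its (approximate) record count."""
    payload = json.loads(body)
    return {entry["name"]: entry["count"] for entry in payload.get("sObjects", [])}


counts = parse_record_counts(sample_response)
print(counts)
```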

CLI based Ingestion

Install the Plugin

pip install 'acryl-datahub[salesforce]'

Starter Recipe

Check out the following recipe to get started with ingestion! See below for full configuration options.

For general pointers on writing and running a recipe, see our main recipe guide.

pipeline_name: my_salesforce_pipeline
source:
  type: "salesforce"
  config:
    instance_url: "https://mydomain.my.salesforce.com/"
    username: user@company
    password: password_for_user
    security_token: security_token_for_user
    platform_instance: mydomain-dev-ed
    domain:
      sales:
        allow:
          - "Opportunity$"
          - "Lead$"

    object_pattern:
      allow:
        - "Account$"
        - "Opportunity$"
        - "Lead$"

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
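
The allow patterns in the recipe are regular expressions, so the trailing `$` matters. The sketch below approximates allow/deny filtering with Python's `re.match`; it is an illustration under the assumption that patterns are matched from the start of the object name and that deny takes precedence, not DataHub's own `AllowDenyPattern` class. The function name `is_allowed` is hypothetical.

```python
import re
from typing import Sequence


def is_allowed(
    name: str,
    allow: Sequence[str] = (".*",),
    deny: Sequence[str] = (),
    ignore_case: bool = True,
) -> bool:
    """Return True if name matches an allow regex and no deny regex."""
    flags = re.IGNORECASE if ignore_case else 0
    if any(re.match(pattern, name, flags) for pattern in deny):
        return False
    return any(re.match(pattern, name, flags) for pattern in allow)


allow = ["Account$", "Opportunity$", "Lead$"]
for candidate in ["Account", "AccountHistory", "opportunity", "Contact"]:
    print(candidate, is_allowed(candidate, allow=allow))
```

Under these assumptions, `Account$` admits `Account` but not `AccountHistory`, which is why the recipe anchors each pattern with `$`.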

Config Details

Note that a . is used to denote nested fields in the YAML recipe.
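
To make the dotted notation concrete, here is a minimal sketch that expands dotted field names into the nested structure a YAML recipe would contain. The helper `expand_dotted` is hypothetical and exists only to illustrate the naming convention.

```python
from typing import Any, Dict


def expand_dotted(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {'a.b': 1} into {'a': {'b': 1}}."""
    nested: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate mappings as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


config = expand_dotted(
    {"profiling.enabled": True, "object_pattern.allow": ["Account$"]}
)
print(config)
```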

View All Configuration Options
| Field [Required] | Type | Description | Default | Notes |
| --- | --- | --- | --- | --- |
| access_token [✅] | string | Access token for instance url | None | |
| auth [✅] | Enum | | USERNAME_PASSWORD | |
| ingest_tags [✅] | boolean | Ingest Tags from source. This will override Tags entered from UI | None | |
| instance_url [✅] | string | Salesforce instance url. e.g. https://MyDomainName.my.salesforce.com | None | |
| is_sandbox [✅] | boolean | Connect to Sandbox instance of your Salesforce | None | |
| password [✅] | string | Password for Salesforce user | None | |
| platform [✅] | string | | salesforce | |
| platform_instance [✅] | string | The instance of the platform that all assets produced by this recipe belong to | None | |
| security_token [✅] | string | Security token for Salesforce username | None | |
| username [✅] | string | Salesforce username | None | |
| env [✅] | string | The environment that all assets produced by this connector belong to | PROD | |
| domain [✅] | map(str,AllowDenyPattern) | A class to store allow deny regexes | None | |
| domain.key.allow [❓ (required if domain is set)] | array(string) | | None | |
| domain.key.deny [❓ (required if domain is set)] | array(string) | | None | |
| domain.key.ignoreCase [❓ (required if domain is set)] | boolean | Whether to ignore case sensitivity during pattern matching. | True | |
| object_pattern [✅] | AllowDenyPattern | Regex patterns for Salesforce objects to filter in ingestion. | {'allow': ['.*'], 'deny': [], 'ignoreCase': True} | |
| object_pattern.allow [❓ (required if object_pattern is set)] | array(string) | | None | |
| object_pattern.deny [❓ (required if object_pattern is set)] | array(string) | | None | |
| object_pattern.ignoreCase [❓ (required if object_pattern is set)] | boolean | Whether to ignore case sensitivity during pattern matching. | True | |
| profile_pattern [✅] | AllowDenyPattern | Regex patterns for profiles to filter in ingestion, allowed by the object_pattern. | {'allow': ['.*'], 'deny': [], 'ignoreCase': True} | |
| profile_pattern.allow [❓ (required if profile_pattern is set)] | array(string) | | None | |
| profile_pattern.deny [❓ (required if profile_pattern is set)] | array(string) | | None | |
| profile_pattern.ignoreCase [❓ (required if profile_pattern is set)] | boolean | Whether to ignore case sensitivity during pattern matching. | True | |
| profiling [✅] | SalesforceProfilingConfig | | {'enabled': False} | |
| profiling.enabled [❓ (required if profiling is set)] | boolean | Whether profiling should be done. Supports only table-level profiling at this stage | None | |

Code Coordinates

  • Class Name: datahub.ingestion.source.salesforce.SalesforceSource
  • Browse on GitHub

Questions

If you have any questions about configuring ingestion for Salesforce, feel free to ping us on our Slack.