Aviatrix Webhooks with Azure Function

With MCNA (Multi-Cloud Network Architecture) we have a very simple topology and design. There are three key components:

  • Controller – our brain
  • Gateways – our muscles handling all traffic 
  • CoPilot – eyes giving us insight into what is happening there

Let's focus today on the monitoring part and CoPilot. Its main purpose is to monitor the health of our infrastructure and provide analytics on traffic flows. It has many cool features that could fill a few separate posts, but one feature that, in my opinion, is not used that often is webhook integration.

Webhook – what is it?

It is a provider-consumer model of communication based on the HTTP protocol. Here our provider is CoPilot; the consumer … the sky is the limit 🙂 You can leverage it in many different ways:

  • integrate it with ticketing system (ServiceNow)
  • integrate it with Microsoft Teams for NOC notifications
  • send to remote NMS (Network Management Software)


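Webhooks run over HTTP and can use HTTPS, so the notification can be encrypted in transit. That makes it safer than SNMPv2, which sends its community strings in clear text.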
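
To make the consumer side a bit more concrete, here is a minimal sketch in Python of what a consumer does with the body of a webhook POST. The field names (alert, gateways, affectedGW, recoveredGW) follow the test payload used later in this post; the helper name parse_alert is hypothetical, and your real payload depends on the template you define in CoPilot.

```python
import json


def parse_alert(body: bytes) -> dict:
    """Decode a webhook POST body and normalise the fields used in this post."""
    data = json.loads(body.decode("utf-8"))
    return {
        # Upper-case the status so "open" and "OPEN" are treated the same
        "alert": str(data.get("alert", "")).upper(),
        # Missing lists default to empty instead of raising KeyError
        "gateways": list(data.get("gateways", [])),
        "affectedGW": list(data.get("affectedGW", [])),
        "recoveredGW": list(data.get("recoveredGW", [])),
    }


# Hypothetical body, shaped like the test payload shown later in this post
sample = b'{"alert": "open", "gateways": ["spoke1", "GW2"], "affectedGW": ["spoke1"]}'
alert = parse_alert(sample)
print(alert["alert"], alert["affectedGW"], alert["recoveredGW"])
```

Defaulting missing lists to empty keeps the consumer from crashing when a template leaves a field out.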

As mentioned at the beginning, CoPilot sees the health of our Aviatrix infrastructure. It knows when one of the Aviatrix gateways (AVX GW) goes down. Why not use that for something more? Let's redesign our infrastructure on the event of … a GATEWAY going DOWN.

  1. Our critical PING (the most important protocol for a network engineer) normally goes via the MCNA infrastructure (from VM4-spoke3 via the SpGW to the TrGW and on to the SpGW in VNET spoke1-az)
  2. FAILURE – our SpGW goes DOWN in VNET spoke-3-az
  3. A few seconds later CoPilot is aware of that and triggers an HTTP event to the Azure Function
  4. The Azure Function runs its code
  5. Native peering is created
  6. Traffic is again able to reach its destination
  7. The SpGW is restored. CoPilot is aware of that too and triggers another event to the Azure Function, which does the cleanup (removes the native peering)
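
The decision behind steps 3 to 7 can be sketched as one small pure function. This is only an illustration under assumptions, not the code the function runs: decide_action and the action names "failover", "restore" and "ignore" are hypothetical, and the alert fields are the ones from the test payload shown later in this post.

```python
def decide_action(alert: dict, tracked_gw: str) -> str:
    """Map an incoming alert to one of three actions for the tracked gateway."""
    status = alert.get("alert", "")
    # Steps 2-5: the tracked gateway is reported as affected in an OPEN alert
    if status == "OPEN" and tracked_gw in alert.get("affectedGW", []):
        return "failover"   # create the native peering
    # Step 7: the tracked gateway is reported as recovered in a CLOSED alert
    if status == "CLOSED" and tracked_gw in alert.get("recoveredGW", []):
        return "restore"    # remove the native peering
    # Anything else (another gateway, unknown status) changes nothing
    return "ignore"


down = {"alert": "OPEN", "gateways": ["spoke1", "GW2"], "affectedGW": ["spoke1"]}
up = {"alert": "CLOSED", "gateways": ["spoke1", "GW2"], "recoveredGW": ["spoke1"]}
print(decide_action(down, "spoke1"), decide_action(up, "spoke1"), decide_action(up, "GW2"))
```

Keeping the decision separate from the Controller calls makes it easy to test with sample payloads before touching any real gateway.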

CoPilot

Configuration on CoPilot is very simple …
First we need to define the webhook itself under Settings. It uses templates, so you can customize it to your own needs, and with the Preview function you can easily see the expected result.

Then we need to define the condition that triggers it:

When a GW failure event is received, we get an Alert. When it is resolved, the Alert changes its status to Closed.
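
To exercise both alert states without taking a real gateway down, you can build test bodies by hand. The sketch below is an assumption-heavy example: build_test_payload is a hypothetical helper, the field names come from the test payload in the code further down, and filling affectedGW only on OPEN and recoveredGW only on CLOSED is my guess at a sensible test shape, not something CoPilot guarantees.

```python
import json


def build_test_payload(status: str, gateway: str) -> str:
    """Return a JSON body that mimics an OPEN or CLOSED alert for one gateway."""
    if status not in ("OPEN", "CLOSED"):
        raise ValueError("status must be OPEN or CLOSED")
    payload = {
        "alert": status,
        "gateways": [gateway],
        # Assumption: only the list matching the alert state is populated
        "affectedGW": [gateway] if status == "OPEN" else [],
        "recoveredGW": [gateway] if status == "CLOSED" else [],
    }
    return json.dumps(payload)


print(build_test_payload("OPEN", "spoke1"))
print(build_test_payload("CLOSED", "spoke1"))
```

The resulting strings can be POSTed to the function URL with any HTTP client.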

Azure

This post is not about explaining how to define a function on the Azure side. There are plenty of tutorials out there. It was actually my first one and I was able to do it :). One tip though – Visual Studio Code makes it much simpler. You just write the code locally and publish it directly to Azure.

Code …

I’m not a professional programmer and this piece of code can definitely be improved, but it works for me 🙂

import json
import logging
import os

import requests
from urllib3.exceptions import InsecureRequestWarning

import azure.common
import azure.functions as func
from azure.identity import AzureCliCredential, DefaultAzureCredential
# The management clients and models below are not used yet; kept for future Azure-side operations
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import SubResource, VirtualNetworkPeering
from azure.mgmt.resource import ResourceManagementClient

async def main(req: func.HttpRequest) -> func.HttpResponse:
    # In the Vault we created a Key called "subscription" -> it holds the subscription ID
    # Then in the App Registry (function container) -> Configure -> Env Variable AZURE_SUBSCRIPTION_ID takes its value from the Vault
    # The Function itself runs under System -> Role -> Contributor on the whole Subscription (for this test only)
    """
    To test, go to CoPilot and send this payload:
    {
    "alert": "CLOSED",  --> change between OPEN and CLOSED
    "gateways": ["spoke1","GW2"],
    "affectedGW": ["spoke1"],
    "recoveredGW": ["spoke1"]
    }
    """
    var_header = {}
    var_tracked_gw = "avx-spoke-3-az-spoke"
    var_tracked_gw_RG = "rg-av-avx-spoke-3-az-spoke-241543"
    var_tracked_gw_RGid = "fb06971f-2809-43a7-8246-99a2a9f51dd9" #spoke3 vnet ID (Resource GUID)
    var_transit_gw = "avx-trgw-1-firenet"
    var_controller_ip = "1.2.3.4"   # change to your Controller's IP
    var_controller_url = f"https://{var_controller_ip}/v1/api"
    var_controller_password = os.environ.get('CONTROLLER_PASS','secretPASS$123')
    var_azure_account = "AZURE-PK"
    var_azure_region = "West Europe"

    CID = ""
    response = ""
    payload = ""

    # Get the Azure subscription ID and credentials -> not used below, kept for future operations in case EVER needed
    subscription_id = os.environ.get('AZURE_SUBSCRIPTION_ID', '11111111-1111-1111-1111-111111111111')
    credential = DefaultAzureCredential()
    logging.info('Python HTTP trigger function processed a request.')

    # Decode the alert body and parse its fields as JSON
    req_body_bytes = req.get_body()
    logging.info(f"Request Bytes: {req_body_bytes}")
    req_body = req_body_bytes.decode("utf-8")
    logging.info(f"Request after decoding: {req_body}")
    parser = json.loads(req_body)
    logging.info("parser extract gateways: %s", parser["gateways"])

    # Login to Aviatrix Controller and get CID to use in future requests as a TOKEN
    # verify=False skips TLS certificate validation towards the Controller, so we silence the warning
    requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)
    payload={'action': 'login','username': 'admin', 'password': var_controller_password}
    response = requests.post(var_controller_url, headers=var_header, data=payload, verify=False)
    CID=json.loads(response.text)['CID']

    if (var_tracked_gw in parser["gateways"]):
        logging.info(f"Controller login happens with response: {response.text}")
        if var_tracked_gw in parser["affectedGW"] and parser["alert"] == "OPEN":
            logging.info(f"Gateway is DOWN, I will create the peering for you my lord")
            # detach SPOKE from Transit 
            logging.info(f"Detaching DOWN Spoke from Transit")
            payload={'action': 'detach_spoke_from_transit_gw', 'CID': CID, 'spoke_gw': var_tracked_gw, 'transit_gw': var_transit_gw}
            response = requests.post(var_controller_url, headers=var_header, data=payload, verify=False)
            logging.info(f"Result of Detaching Spoke from Transit: {response.text}")

            # Attach ARM Native Spoke
            logging.info(f"Attaching Spoke Natively via Aviatrix Controller. Traffic must flow")
            payload={'action': 'attach_arm_native_spoke_to_transit', 'CID': CID, 'transit_gateway_name': var_transit_gw,'account_name': var_azure_account, 'region': var_azure_region, 'vpc_id': f'{var_tracked_gw}:{var_tracked_gw_RG}:{var_tracked_gw_RGid}'}
            response = requests.post(var_controller_url, headers=var_header, data=payload, verify=False)
            logging.info(f"Native Spoke attached. Traffic should work again. Result is {response.text}")
        else:
            logging.info(f"bad luck sucker, no change gonna happen")
    elif var_tracked_gw in parser["recoveredGW"] and parser["alert"] == "CLOSED":
        logging.info(f"gateway recovered, I am destroying any peering and restoring traffic via Aviatrix as you wish")
        # Detach ARM Native Spoke 
        logging.info(f"Removing NATIVE attachment. Prepare to reattach working AVX GW")
        payload={'action': 'detach_arm_native_spoke_to_transit', 'CID': CID, 'transit_gateway_name': var_transit_gw, 'spoke_name': f'{var_azure_account}:{var_tracked_gw}:{var_tracked_gw_RG}:{var_tracked_gw_RGid}'}
        response = requests.post(var_controller_url, headers=var_header, data=payload, verify=False)
        logging.info(f"Native attachment REMOVED. Result is {response.text}")

        # Attach Azure Spoke GW 
        logging.info(f"Attaching Aviatrix GW now that it is UP")
        payload={'action': 'attach_spoke_to_transit_gw', 'CID': CID, 'spoke_gw': var_tracked_gw, 'transit_gw': var_transit_gw}
        response = requests.post(var_controller_url, headers=var_header, data=payload, verify=False)
        logging.info(f"Aviatrix Spoke connected now to Transit. Result is {response.text}")
    else:
        logging.info("Gateway did not match")
    return func.HttpResponse("Finished.", status_code=200)
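
One easy improvement is to check what the Controller answers instead of only logging it. Below is a minimal sketch, not part of the function above. The helper names controller_ok and extract_cid are hypothetical. The CID key is the one the login call above reads; the boolean "return" field is my assumption about how the Controller API signals success, so verify it against the replies you see in your own logs.

```python
import json


def controller_ok(response_text: str) -> bool:
    """Return True only if the reply is a JSON object that reports success."""
    try:
        body = json.loads(response_text)
    except ValueError:
        # Not JSON at all (for example an HTML error page)
        return False
    # Assumption: the Controller signals success with a boolean "return" field
    return isinstance(body, dict) and body.get("return") is True


def extract_cid(response_text: str) -> str:
    """Pull the session CID out of a login reply, or raise a clear error."""
    if not controller_ok(response_text):
        raise RuntimeError("Controller login failed")
    cid = json.loads(response_text).get("CID")
    if not cid:
        raise RuntimeError("Controller login reply has no CID")
    return cid


# Hypothetical login reply, for illustration only
print(extract_cid('{"return": true, "CID": "abc123"}'))
```

With helpers like these the function could stop early on a failed login, rather than failing on the missing CID key or sending later calls without a valid session.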

Contributors …

A big thank you to Mihai Tănăsescu. We worked together on that code / execution …

Also thanks to Tomasz Klimczyk (for inspiration) and the rest of the team (Adam Stipkovits, Tyrone Philip, Ian Gyte), as we all worked on ideas for using webhooks and presented them to our customers.
