Manage Models in ZEDEDA Edge Inference Service Using the Console

Introduction

Models are the core AI artifacts that encapsulate a trained machine-learning algorithm together with its weights, metadata, and runtime requirements. In Edge Inference Service, a model lives inside an Organization (the repository that groups related models) and can be imported, versioned, and benchmarked for edge deployment. Use a model when you need to run inference directly on an edge device, for example, to detect objects in a video stream, classify sensor data, or run a custom prediction service.

A model differs from an agent as follows: a model is the AI artifact you train, while an agent is the full, deployable package that tells Edge AI Flows how to run that model on the edge. The agent references a model (by name and version) and adds all the surrounding configuration needed for a production-grade deployment.

Prerequisites

For Edge Inference Service users:

  • User account with appropriate role 
  • Access to at least one organization with a Model
  • API key for programmatic access (optional)

For model upload via Jupyter:

  • Python 3.7+ environment
  • ZEDEDA Edge AI Client installed
  • Model trained and ready for deployment
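
Before opening a notebook, it can help to confirm that the interpreter meets the Python requirement listed above. The following is a minimal sketch that checks only the Python version; the helper name meets_minimum_python is illustrative and is not part of any ZEDEDA tooling.

```python
import sys

# Minimum interpreter version listed in the prerequisites above.
MINIMUM_PYTHON = (3, 7)


def meets_minimum_python(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True when the given (or running) interpreter is new enough.

    Hypothetical helper for illustration; it does not check whether the
    ZEDEDA Edge AI Client is installed.
    """
    current = tuple(version_info or sys.version_info)[:2]
    return current >= minimum


if meets_minimum_python():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Python 3.7 or later is required.")
```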

For deployment:

View Models

Explore your machine learning models. 

  1. Log into the Edge Inference Service console.
  2. Navigate to Model Hub > Models in the left panel to view the available AI models.
  3. From here you can do the following:
    • Select a model card to view its technical details, such as the supported framework (for example, TensorFlow or PyTorch) and version history.
    • Filter by model stage: Production, Staging, Archived, or Development.
    • Filter by tags, such as Job Id, Import Timestamp, Imported From, Library, Original Identifier, Pipeline Tag, or Source.
    • Use the filter icon to search the models by keyword.

View Model Details

View the complete information about model versions, metrics, and specifications. 

  1. Log into the Edge Inference Service console.
  2. Navigate to Model Hub > Models in the left panel to view the available AI models.
  3. Scroll, search, or filter to find the model.
  4. Select a card.
  5. View the details:

Metadata:

  • Model name, version, and description
  • Creation and last update timestamps
  • Creator/owner information
  • Model stage and status
  • Tags for categorization

Metrics:

  • Training accuracy, loss, or custom metrics
  • Validation performance
  • Comparison with previous versions
  • Performance trends over time

Technical Specifications:

  • ML framework and version (TensorFlow, PyTorch, XGBoost, etc.)
  • Model input shapes and data types
  • Model output shapes and formats
  • Model size and architecture details
  • Runtime dependencies and requirements

Artifacts:

  • Model files (.pb, .pth, .onnx, etc.)
  • Documentation files (README, model cards)
  • Test datasets for validation
  • Configuration files
  • Preprocessing/postprocessing scripts

Deployment History:

  • Current deployments and their status
  • Historical deployment records
  • Deployment configurations used
  • Performance in production

Deploy a Model

Deploying a model means taking a trained AI model that already exists in an Organization and packaging it into a Helm chart that can run on a ZEDEDA Edge Kubernetes Service cluster managed by ZEDEDA Cloud. You can target a single cluster or a cluster group (multiple clusters at once). 

  1. Log into the Edge Inference Service console.
  2. Navigate to Model Hub > Models in the left panel to view the available AI models.
  3. Scroll, search, or filter to find the model.
  4. Select a model card to view its technical details.
  5. Click Deploy.
  6. Add a new Deployment Submission:
    • Basic Info:
      • Enter a unique Name.
      • Enter a Title.
      • (Optional) Enter a Description.
    • Model Configuration:
      • Select a Model from the dropdown list. 
      • Select a Model Version from the dropdown list.
      • Select a Business App Image from the dropdown list.
        • zededa/edgeai-demo-app:latest is pre-registered in the platform.
        • You can supply a fully qualified image name (such as docker.io/myorg/myapp:v2) and the platform will use that image in the generated Helm chart. 
    • Target Configuration:
      The platform uses the ZEDEDA Edge Orchestrator API Token from your profile to fetch the list of clusters and cluster groups that belong to your ZEDEDA Edge Orchestrator Host, so that you do not need to enter them manually.
      • Select Single Cluster or Cluster Group from the dropdown list.
      • Select a Target cluster or group from the dropdown list.
      • The Architecture and Project Name are autopopulated based on the previous selections.
  7. Click Create Deployment to create a Deployment request. 
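
It can be useful to collect the form values before you open the console, for example when several people prepare deployments. The sketch below mirrors the fields from step 6 in a small Python data class and flags empty required fields. The class, its field names, and the sample values (other than the pre-registered image name) are assumptions made for illustration; they are not a ZEDEDA API or payload format.

```python
from dataclasses import dataclass
from typing import List

# Target types offered in the Target Configuration step.
TARGET_TYPES = ("Single Cluster", "Cluster Group")


@dataclass
class DeploymentSubmission:
    """Hypothetical in-memory mirror of the Deployment Submission form."""

    name: str
    title: str
    model: str
    model_version: str
    business_app_image: str
    target_type: str
    target: str
    description: str = ""  # optional in the form

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means ready to enter."""
        found = []
        required = {
            "Name": self.name,
            "Title": self.title,
            "Model": self.model,
            "Model Version": self.model_version,
            "Business App Image": self.business_app_image,
            "Target": self.target,
        }
        for label, value in required.items():
            if not value.strip():
                found.append(f"{label} is required")
        if self.target_type not in TARGET_TYPES:
            found.append("Target type must be one of: " + ", ".join(TARGET_TYPES))
        return found


# Placeholder values; only the image name comes from the steps above.
submission = DeploymentSubmission(
    name="line-3-detector",
    title="Line 3 object detector",
    model="object-detector",
    model_version="v1.0.0",
    business_app_image="zededa/edgeai-demo-app:latest",
    target_type="Single Cluster",
    target="factory-cluster-01",
)
print(submission.problems())
```

The check is local only. Uniqueness of the Name and the availability of a cluster are still verified by the console when you click Create Deployment.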

Deployment Strategy

Development and Testing:

  • Use Quick Deploy for rapid iteration
  • Test on single device first
  • Validate inference endpoint works correctly
  • Check resource usage and performance

Pre-Production:

  • Use benchmarking to validate performance on real hardware
  • Test multiple model versions to find best performer
  • Verify model meets latency and accuracy requirements
  • Test with realistic data and load

Production Deployment:

  • Use GitOps workflow for audit trail
  • Deploy to small cluster group first (canary)
  • Monitor performance for 24-48 hours
  • Gradually roll out to full fleet
  • Have rollback plan ready
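
The canary-then-gradual rollout above can be planned ahead of time. The sketch below splits a list of targets into waves, starting with a small canary and growing each later wave. The function name, the doubling factor, and the target names are assumptions for illustration; the platform does not prescribe wave sizes, and the 24-48 hour monitoring window between waves remains a manual step.

```python
from typing import List


def plan_rollout(
    targets: List[str], canary_size: int = 1, growth: int = 2
) -> List[List[str]]:
    """Split targets into waves: a small canary first, then growing waves.

    Each wave is `growth` times larger than the previous one, and the last
    wave takes whatever targets remain.
    """
    if canary_size < 1 or growth < 1:
        raise ValueError("canary_size and growth must be at least 1")
    waves = []
    start = 0
    size = canary_size
    while start < len(targets):
        waves.append(targets[start:start + size])
        start += size
        size *= growth
    return waves


# Hypothetical cluster names.
fleet = [f"cluster-{n:02d}" for n in range(1, 9)]
for number, wave in enumerate(plan_rollout(fleet), start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Deploy one wave, monitor it, and continue with the next wave only if the metrics stay within your thresholds; otherwise, use the rollback plan.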

Post-Deployment:

  • Enable monitoring and logging
  • Set up alerts for errors or performance degradation
  • Review metrics regularly
  • Plan for model updates and retraining

Download a Model

Download model files for local testing or offline deployment scenarios. You can download an entire model version with all artifacts. 

  1. Log into the Edge Inference Service console.
  2. Navigate to Model Hub > Models in the left panel to view the available AI models.
  3. Scroll, search, or filter to find the model.
  4. Select a model card.
  5. Click Download.

Copy a Model Between Organizations

Copy a model to share it across teams or to promote it from a development organization to a production organization. Copying maintains the model's provenance.

  1. Log into the Edge Inference Service console.
  2. Navigate to Model Hub > Models in the left panel to view the available AI models.
  3. Scroll, search, or filter to find the model.
  4. Select a model card.
  5. Click the Copy Model icon.
  6. Enter a New Model Name.
  7. Select a Model Version from the dropdown list.
  8. Select a Destination Organization from the dropdown list.
  9. Click Copy.

Upload a Model

Upload a model through the console when you want to add pre-trained models from files, add externally trained models, or upload quickly without setting up the ZEDEDA Jupyter Client.

  1. Log into the Edge Inference Service console.
  2. Navigate to Model Hub > Models in the left panel to view the available AI models.
  3. Click Upload Model.
  4. Select files by drag and drop or browse.
  5. Enter model details:
    1. Enter a unique Model Name.
    2. Select a Target Organization from the dropdown list.
    3. (Optional) Enter a Description.
    4. Add Tags
      A tag is a key-value pair that you attach to a model. Tags are used for:

      Purpose      How it helps
      Filtering    Find all models that share a common attribute (for example, environment: prod, framework: onnx).
      Grouping     Organize models by project, team, or use case.
      Search       Tags appear in the organization search UI, making it easier to locate a model.
      Automation   Some deployment or benchmarking workflows can be configured to act on models that carry specific tags.

  6. Click Upload.
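
To make the filtering and grouping purposes of tags concrete, the sketch below treats tags as plain key-value dictionaries and shows both operations on a small in-memory catalog. The model names, tag values, and function names are hypothetical; the console performs these operations for you, and this code does not call the platform.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical models and tags, used only to illustrate the idea.
MODELS: Dict[str, Dict[str, str]] = {
    "defect-detector": {"environment": "prod", "framework": "onnx"},
    "churn-predictor": {"environment": "dev", "framework": "xgboost"},
    "ppe-detector": {"environment": "prod", "framework": "pytorch"},
}


def filter_by_tags(models: Dict[str, Dict[str, str]], wanted: Dict[str, str]) -> List[str]:
    """Return names of models whose tags contain every wanted key-value pair."""
    return [
        name
        for name, tags in models.items()
        if all(tags.get(key) == value for key, value in wanted.items())
    ]


def group_by_tag(models: Dict[str, Dict[str, str]], key: str) -> Dict[str, List[str]]:
    """Group model names by the value of one tag key; untagged models are skipped."""
    groups = defaultdict(list)
    for name, tags in models.items():
        if key in tags:
            groups[tags[key]].append(name)
    return dict(groups)


print(filter_by_tags(MODELS, {"environment": "prod"}))
print(group_by_tag(MODELS, "framework"))
```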

Alternatively, upload a model using the Jupyter Notebook integration to manage models programmatically from within your workflows.

Upload Artifacts

Attach additional files to a model version, such as documentation (README.md, model cards, usage guides), test datasets (sample inputs, validation data), configuration files (deployment configs, parameter files), or supplementary scripts (preprocessing, postprocessing).

  1. Log into the Edge Inference Service console.
  2. Navigate to Model Hub > Models in the left panel to view the available AI models.
  3. Scroll, search, or filter to find the model.
  4. Click Upload Artifacts.
  5. Fill in the fields and attach files.
  6. Click Upload.

Import a Model

After you add a source, you can import third-party models from external ML repositories. Importing copies the model from the third-party service and stores it in the ZEDEDA Edge Inference Service, so you can reuse a model that already exists in your organization rather than reaching out to an external service every time. This is useful if you have network latency issues, security concerns, or an air-gapped environment.

  1. Log into the Edge Inference Service console.
  2. Add a source.
  3. Navigate to Model Hub > Models in the left panel to view the available AI models.
  4. Click Import Model.
  5. Select a Source from the list of providers you already added.
  6. Fill in the fields.
  7. Click Create Job.

Model Stages

The stages include the following:

  • Development - Model under active development
  • Staging - Model ready for testing
  • Production - Model approved for production use
  • Archived - Deprecated or retired model
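
If you track stages in your own scripts or notebooks, a fixed enumeration helps avoid typos in stage names. The sketch below encodes the four stages listed above; the ModelStage class and its parse helper are illustrative assumptions, not part of a ZEDEDA client library.

```python
from enum import Enum


class ModelStage(Enum):
    """The four model stages, with the descriptions from the list above."""

    DEVELOPMENT = "Model under active development"
    STAGING = "Model ready for testing"
    PRODUCTION = "Model approved for production use"
    ARCHIVED = "Deprecated or retired model"

    @classmethod
    def parse(cls, label: str) -> "ModelStage":
        """Look up a stage by its label, ignoring case and surrounding spaces."""
        try:
            return cls[label.strip().upper()]
        except KeyError:
            valid = ", ".join(stage.name.title() for stage in cls)
            raise ValueError(
                f"Unknown stage {label!r}; expected one of: {valid}"
            ) from None


stage = ModelStage.parse("Production")
print(stage.name.title(), "-", stage.value)
```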

Search or Filter for a Model

Search tips:

  • Use autocomplete for popular tags and authors.
  • Sort by Updated At to find recently modified models.
  • Save common search filters for quick access.
  • Filter by framework when targeting specific inference engines.

Text Search

Search: "customer churn prediction"
→ Matches model names, descriptions, and tags

Filter by Tags

Tags: classification, xgboost, production
→ Shows only models with all specified tags

Filter by Status

Status: Production
→ Shows only production-approved models

Filter by Author

Author: data-science-team
→ Shows models created by specific user or team

Combine Filters

Search: "anomaly detection"
Tags: tensorflow
Status: Production, Staging
Author: ml-engineer@company.com
→ Highly targeted results
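
The way the filters combine can be sketched as a single predicate: the search text matches names, descriptions, and tags; every listed tag must be present; any one of the listed statuses is enough; and the author must match. The record type, field names, and sample data below are hypothetical, and the simple substring match is an assumption about how text search behaves, not a description of the console's implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class ModelRecord:
    """Hypothetical local record; field names are illustrative."""

    name: str
    description: str
    tags: Set[str]
    status: str
    author: str


def matches(
    model: ModelRecord,
    search: Optional[str] = None,
    tags: Optional[Iterable[str]] = None,
    statuses: Optional[Iterable[str]] = None,
    author: Optional[str] = None,
) -> bool:
    """Return True when the model satisfies every filter that was supplied."""
    if search:
        # Text search covers the model name, description, and tags.
        haystack = " ".join([model.name, model.description, *sorted(model.tags)])
        if search.lower() not in haystack.lower():
            return False
    if tags and not set(tags) <= model.tags:  # all specified tags are required
        return False
    if statuses and model.status not in set(statuses):  # any listed status is enough
        return False
    if author and model.author != author:
        return False
    return True


# Sample data invented for this sketch.
catalog = [
    ModelRecord("pump-anomaly", "Anomaly detection for pump vibration",
                {"tensorflow"}, "Production", "ml-engineer@company.com"),
    ModelRecord("churn-xgb", "Customer churn prediction",
                {"classification", "xgboost", "production"}, "Staging",
                "data-science-team"),
]

hits = [
    m.name
    for m in catalog
    if matches(m, search="anomaly detection", tags=["tensorflow"],
               statuses=["Production", "Staging"],
               author="ml-engineer@company.com")
]
print(hits)
```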

Model Versioning Best Practices

Version Naming:

  • Custom version names are not available. The platform auto-increments the model version, for example: v1.0.0, v1.1.0, v2.0.0
  • Include date for time-based versioning: v20250113
  • Be consistent across models in the same project
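
When you compare version labels in a script, sort them numerically rather than as text, because a plain string sort places v1.10.0 before v1.9.0. The sketch below assumes the three-part vMAJOR.MINOR.PATCH format shown above; the function names are illustrative, and date-style labels such as v20250113 are deliberately rejected by this parser.

```python
import re
from typing import Iterable, Tuple

# Matches labels such as v1.0.0 or v2.13.4 (three numeric parts).
_VERSION = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def version_key(label: str) -> Tuple[int, int, int]:
    """Convert a label like 'v1.10.0' into a tuple that sorts numerically."""
    match = _VERSION.match(label)
    if match is None:
        raise ValueError(f"Not a vMAJOR.MINOR.PATCH label: {label!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def latest(labels: Iterable[str]) -> str:
    """Return the highest version label from a non-empty collection."""
    return max(labels, key=version_key)


versions = ["v1.9.0", "v1.10.0", "v1.0.0"]
print(sorted(versions, key=version_key))
print(latest(versions))
```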

Model Metadata:

  • Tag models with framework and version: tensorflow-2.12
  • Tag with use case: classification, detection, segmentation
  • Tag with training date: trained-2025-01
  • Tag with team or owner: data-science-team

Documentation:

  • Attach model cards with:
    • Model purpose and use cases
    • Training data description
    • Performance metrics (accuracy, latency)
    • Known limitations
    • Example usage
  • Include test datasets as artifacts
  • Provide preprocessing/postprocessing code
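
A model card can be generated from a few fields so that every model in a project documents the same sections. The sketch below renders the sections listed above into plain text that you could save as a README and attach as an artifact. The function name, the layout, and the sample values are assumptions for illustration; the platform does not require a particular model card format.

```python
from typing import Dict, List


def render_model_card(
    name: str,
    purpose: str,
    training_data: str,
    metrics: Dict[str, str],
    limitations: List[str],
    example_usage: str,
) -> str:
    """Render a plain-text model card covering the sections listed above."""
    lines = [
        f"Model card: {name}",
        "",
        "Purpose and use cases",
        purpose,
        "",
        "Training data",
        training_data,
        "",
        "Performance metrics",
    ]
    lines += [f"  {key}: {value}" for key, value in metrics.items()]
    lines += ["", "Known limitations"]
    lines += [f"  - {item}" for item in limitations]
    lines += ["", "Example usage", example_usage]
    return "\n".join(lines) + "\n"


# Placeholder content; replace every value with facts about your own model.
card = render_model_card(
    name="example-model",
    purpose="<what the model is for>",
    training_data="<where the training data came from>",
    metrics={"accuracy": "<measured value>", "latency": "<measured value>"},
    limitations=["<known limitation>"],
    example_usage="<short usage snippet>",
)
print(card)
```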

Next Steps
