Introduction
Models are the core AI artifacts that encapsulate a trained machine‑learning algorithm together with its weights, metadata, and runtime requirements. In Edge Inference Service, a model lives inside an Organization (the repository that groups related models) and can be imported, versioned, and benchmarked for edge deployment. You use a model when you need to run inference directly on an edge device, for example, to detect objects in a video stream, classify sensor data, or run a custom prediction service.
A model differs from an agent as follows: a model is the AI artifact you train, while an agent is the full, deployable package that tells Edge AI Flows how to run that model on the edge. The agent references a model (by name and version) and adds all the surrounding configuration needed for a production‑grade deployment.
Prerequisites
For Edge Inference Service users:
- User account with appropriate role
- Access to at least one organization with a Model
- API key for programmatic access (optional)
For model upload via Jupyter:
- Python 3.7+ environment
- ZEDEDA Edge AI Client installed
- Model trained and ready for deployment
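Before installing the client, you can confirm that the notebook kernel meets the Python 3.7+ requirement. The following is a minimal sketch using only the standard library; the function name `meets_minimum_python` is illustrative and is not part of the ZEDEDA Edge AI Client.

```python
import sys

MINIMUM_PYTHON = (3, 7)  # taken from the prerequisite list above


def meets_minimum_python(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version satisfies the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum_python():
        print("Python version is sufficient.")
    else:
        print("Python 3.7 or later is required; found", sys.version.split()[0])
```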
For deployment:
- Target edge node onboarded in ZEDEDA Cloud
- Edge node running a ZEDEDA Edge Kubernetes Service Cluster
- Network connectivity between Edge Inference Service, ZEDEDA Cloud platform, and edge node to trigger the initial deployment
View Models
Explore your machine learning models.
- Log into the Edge Inference Service console.
- Navigate to Model Hub > Models in the left panel to view the available AI models.
- From here you can do the following:
- Select a model card to view its technical details, such as the supported framework (for example, TensorFlow or PyTorch) and version history.
- Filter by model stage, such as: Production, Staging, Archived, Development.
- Filter by tags, such as: Job Id, Import Timestamp, Imported From, Library, Original Identifier, Pipeline Tag, Source
- Use the filter icon to search the models by keyword.
View Model Details
View the complete information about model versions, metrics, and specifications.
- Log into the Edge Inference Service console.
- Navigate to Model Hub > Models in the left panel to view the available AI models.
- Scroll, search, or filter to find the model.
- Select a card.
- View the details:
Metadata:
- Model name, version, and description
- Creation and last update timestamps
- Creator/owner information
- Model stage and status
- Tags for categorization
Metrics:
- Training accuracy, loss, or custom metrics
- Validation performance
- Comparison with previous versions
- Performance trends over time
Technical Specifications:
- ML framework and version (TensorFlow, PyTorch, XGBoost, etc.)
- Model input shapes and data types
- Model output shapes and formats
- Model size and architecture details
- Runtime dependencies and requirements
Artifacts:
- Model files (.pb, .pth, .onnx, etc.)
- Documentation files (README, model cards)
- Test datasets for validation
- Configuration files
- Preprocessing/postprocessing scripts
Deployment History:
- Current deployments and their status
- Historical deployment records
- Deployment configurations used
- Performance in production
Deploy a Model
Deploying a model means taking a trained AI model that already exists in an Organization and packaging it into a Helm chart that can run on a ZEDEDA Edge Kubernetes Service cluster managed by ZEDEDA Cloud. You can target a single cluster or a cluster group (multiple clusters at once).
- Log into the Edge Inference Service console.
- Navigate to Model Hub > Models in the left panel to view the available AI models.
- Scroll, search, or filter to find the model.
- Select a model card to view its technical details.
- Click Deploy.
- Add a new Deployment Submission:
- Basic Info:
- Enter a unique Name.
- Enter a Title.
- (Optional) Enter a Description.
- Model Configuration
- Select a Model from the dropdown list.
- Select a Model Version from the dropdown list.
- Select a Business App Image from the dropdown list.
- zededa/edgeai-demo-app:latest is pre-registered in the platform
- You can supply a fully qualified image name (such as docker.io/myorg/myapp:v2) and the platform will use that image in the generated Helm chart.
- Target Configuration
The platform uses the ZEDEDA Edge Orchestrator API Token from your profile to fetch the list of clusters and cluster groups that belong to your ZEDEDA Edge Orchestrator Host, so that you do not need to enter them manually.
- Select Single Cluster or Cluster Group from the dropdown list.
- Select a Target cluster or group from the dropdown list.
- The Architecture and Project Name are autopopulated based on the previous selections.
- Click Create Deployment to create a Deployment request.
- If you are deploying a single Model or Agent to multiple sites that require different model versions or configurations, see (Optional) Configure Cluster-Specific Overrides via GitOps.
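The form fields above can be collected into a single record before you fill in the console, which is convenient for review or for keeping deployment requests under version control. The sketch below is illustrative only: the `DeploymentSubmission` class, its field names, and its validation rules are assumptions that mirror the console form, not an Edge Inference Service API or schema. The example values under `__main__` are placeholders.

```python
from dataclasses import dataclass
from typing import List

TARGET_TYPES = ("Single Cluster", "Cluster Group")  # options named in the form
DEFAULT_IMAGE = "zededa/edgeai-demo-app:latest"  # pre-registered image from the steps above


@dataclass
class DeploymentSubmission:
    """Hypothetical record mirroring the Deploy form; not an official schema."""
    name: str
    title: str
    model: str
    model_version: str
    target_type: str
    target: str
    business_app_image: str = DEFAULT_IMAGE
    description: str = ""  # optional in the form

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the record looks complete."""
        issues = []
        for field_name in ("name", "title", "model", "model_version", "target"):
            if not getattr(self, field_name).strip():
                issues.append(f"{field_name} is required")
        if self.target_type not in TARGET_TYPES:
            issues.append(f"target_type must be one of {TARGET_TYPES}")
        if not self.business_app_image.strip():
            issues.append("business_app_image is required")
        return issues


if __name__ == "__main__":
    submission = DeploymentSubmission(
        name="example-deployment",      # placeholder values for illustration
        title="Example deployment",
        model="example-model",
        model_version="1",
        target_type="Single Cluster",
        target="example-cluster",
    )
    print(submission.problems())
```

Checking the record locally does not replace the console's own validation; it only catches missing fields early.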
Deployment Strategy
Development and Testing:
- Use Quick Deploy for rapid iteration
- Test on single device first
- Validate inference endpoint works correctly
- Check resource usage and performance
Pre-Production:
- Use benchmarking to validate performance on real hardware
- Test multiple model versions to find best performer
- Verify model meets latency and accuracy requirements
- Test with realistic data and load
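One way to check a latency requirement is to compare a high percentile of measured inference times against a budget. The sketch below assumes you have collected per-request latencies in milliseconds yourself; the function names, the default 95th percentile, and the nearest-rank method are illustrative choices, not the platform's benchmarking method.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_budget(samples_ms: Sequence[float], budget_ms: float, pct: float = 95.0) -> bool:
    """Return True if the chosen percentile of the samples is within the budget."""
    return percentile(samples_ms, pct) <= budget_ms
```

A percentile is used rather than the mean because a few slow requests can violate a latency requirement even when the average looks healthy.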
Production Deployment:
- Use GitOps workflow for audit trail
- Deploy to small cluster group first (canary)
- Monitor performance for 24-48 hours
- Gradually roll out to full fleet
- Have rollback plan ready
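The canary-then-gradual rollout above can be planned as a sequence of waves. The following sketch shows one possible way to split a fleet; the function name, the default wave fractions, and the cluster names in the usage example are hypothetical, and cluster groups themselves are managed in the platform rather than by this code.

```python
import math
from typing import List, Sequence


def plan_rollout_waves(
    clusters: Sequence[str],
    fractions: Sequence[float] = (0.1, 0.5, 1.0),
) -> List[List[str]]:
    """Split clusters into waves.

    Each fraction is the cumulative share of the fleet covered once that
    wave completes; the final fraction must be 1.0 so every cluster is included.
    """
    if not fractions or any(not 0 < f <= 1 for f in fractions) \
            or list(fractions) != sorted(fractions) or fractions[-1] != 1.0:
        raise ValueError("fractions must be non-decreasing values in (0, 1] ending at 1.0")
    waves: List[List[str]] = []
    done = 0
    for fraction in fractions:
        # Always advance by at least one cluster, but never past the end of the fleet.
        upto = min(len(clusters), max(done + 1, math.ceil(fraction * len(clusters))))
        if upto > done:
            waves.append(list(clusters[done:upto]))
            done = upto
    return waves


if __name__ == "__main__":
    fleet = [f"edge-{i}" for i in range(10)]  # hypothetical cluster names
    for number, wave in enumerate(plan_rollout_waves(fleet), start=1):
        print(f"wave {number}: {wave}")
```

The first wave plays the role of the canary group; the monitoring window between waves is a process decision and is not modeled here.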
Post-Deployment:
- Enable monitoring and logging
- Set up alerts for errors or performance degradation
- Review metrics regularly
- Plan for model updates and retraining
Download a Model
Download model files for local testing or offline deployment scenarios. You can download an entire model version with all artifacts.
- Log into the Edge Inference Service console.
- Navigate to Model Hub > Models in the left panel to view the available AI models.
- Scroll, search, or filter to find the model.
- Select a model card.
- Click Download.
Copy a Model Between Organizations
Copy a model to share it across teams, to promote it from a development organization to a production organization, or to maintain model provenance when a model moves between organizations.
- Log into the Edge Inference Service console.
- Navigate to Model Hub > Models in the left panel to view the available AI models.
- Scroll, search, or filter to find the model.
- Select a model card.
- Click the Copy Model icon.
- Enter a New Model Name.
- Select a Model Version from the dropdown list.
- Select a Destination Organization from the dropdown list.
- Click Copy.
Upload a Model
Upload a model through the console when you want to use pre-trained models from files, add externally trained models, or make a quick upload without setting up the ZEDEDA Jupyter Client.
- Log into the Edge Inference Service console.
- Navigate to Model Hub > Models in the left panel to view the available AI models.
- Click Upload Model.
- Select files by drag and drop or browse.
- Enter model details:
- Enter a unique Model Name.
- Select a Target Organization from the dropdown list.
- (Optional) Enter a Description.
- Add Tags.
A tag is a key‑value pair that you attach to a model. Tags are used for:
| Purpose | How it helps |
| Filtering | Find all models that share a common attribute (for example, environment: prod, framework: onnx). |
| Grouping | Organize models by project, team, or use‑case. |
| Search | Tags appear in the organization search UI, making it easier to locate a model. |
| Automation | Some deployment or benchmarking workflows can be configured to act on models that carry specific tags. |
- Click Upload.
Alternatively, see Using Jupyter Notebook Integration to upload a model programmatically and integrate model management directly into your workflows.
Upload Artifacts
Attach additional files to a model version, such as documentation (README.md, model cards, usage guides), test datasets (sample inputs, validation data), configuration files (deployment configs, parameter files), or supplementary scripts (preprocessing, postprocessing).
- Log into the Edge Inference Service console.
- Navigate to Model Hub > Models in the left panel to view the available AI models.
- Scroll, search, or filter to find the model.
- Click Upload Artifacts.
- Fill in the fields and attach files.
- Click Upload.
Import a Model
After you add a source, you can import third-party models from external ML repositories. Importing copies the model from the third-party service and stores it in the ZEDEDA Edge Inference Service. Storing it in the platform lets you reuse a model that already exists in your organization rather than reaching out to an external service every time. This is useful if you have network latency issues, security concerns, or an air-gapped environment.
- Log into the Edge Inference Service console.
- Add a source.
- Navigate to Model Hub > Models in the left panel to view the available AI models.
- Click Import Model.
- Select a Source from the list of providers you already added.
- Fill in the fields.
- Click Create Job.
Model Stages
The stages include the following:
- Development - Model under active development
- Staging - Model ready for testing
- Production - Model approved for production use
- Archived - Deprecated or retired model
Search or Filter for a Model
Search tips:
- Use autocomplete for popular tags and authors.
- Sort by Updated At to find recently modified models.
- Save common search filters for quick access.
- Filter by framework when targeting specific inference engines.
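As a mental model for how the search and filters in this section combine, the sketch below filters an in-memory list of model records. It assumes that text search is a case-insensitive substring match over name, description, and tags, that a tag filter requires all listed tags, and that a status filter accepts any of the listed stages. The `ModelRecord` type and the `filter_models` function are hypothetical and do not reflect the console's implementation.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass
class ModelRecord:
    """Hypothetical, simplified view of a model card."""
    name: str
    description: str = ""
    tags: Set[str] = field(default_factory=set)
    status: str = "Development"
    author: str = ""


def filter_models(
    models: Iterable[ModelRecord],
    search: Optional[str] = None,
    tags: Iterable[str] = (),
    statuses: Iterable[str] = (),
    author: Optional[str] = None,
) -> List[ModelRecord]:
    """Return the models that satisfy every filter that was supplied."""
    required_tags = {t.lower() for t in tags}
    allowed_statuses = {s.lower() for s in statuses}
    results = []
    for model in models:
        model_tags = {t.lower() for t in model.tags}
        if search:
            haystack = " ".join([model.name, model.description, *sorted(model_tags)]).lower()
            if search.lower() not in haystack:
                continue
        if not required_tags <= model_tags:  # all requested tags must be present
            continue
        if allowed_statuses and model.status.lower() not in allowed_statuses:
            continue  # status matches if it is any one of the listed stages
        if author and model.author.lower() != author.lower():
            continue
        results.append(model)
    return results
```

Each additional filter can only narrow the result set, which is why combining several filters gives the most targeted results.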
Text Search
Search: "customer churn prediction"
→ Matches model names, descriptions, and tags
Filter by Tags
Tags: classification, xgboost, production
→ Shows only models with all specified tags
Filter by Status
Status: Production
→ Shows only production-approved models
Filter by Author
Author: data-science-team
→ Shows models created by specific user or team
Combine Filters
Search: "anomaly detection"
Tags: tensorflow
Status: Production, Staging
Author: ml-engineer@company.com
→ Highly targeted results
Model Versioning Best Practices
Version Naming:
- Custom version names are not available. The model version is auto-incremented, for example: v1.0.0, v1.1.0, v2.0.0
- Include date for time-based versioning: v20250113
- Be consistent across models in the same project
Model Metadata:
- Tag models with framework and version: tensorflow-2.12
- Tag with use case: classification, detection, segmentation
- Tag with training date: trained-2025-01
- Tag with team or owner: data-science-team
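The tagging conventions above are easier to keep consistent when the tags are generated by a small helper. The sketch below builds a key-value tag dictionary; the key names (`framework`, `use_case`, `trained`, `owner`) and the function itself are assumptions chosen for illustration, so adapt them to your project's conventions.

```python
from datetime import date
from typing import Dict


def build_model_tags(
    framework: str,
    framework_version: str,
    use_case: str,
    trained_on: date,
    owner: str,
) -> Dict[str, str]:
    """Build a consistent key-value tag set for a model (hypothetical key names)."""
    return {
        "framework": f"{framework.lower()}-{framework_version}",  # e.g. tensorflow-2.12
        "use_case": use_case.lower(),                             # e.g. classification
        "trained": trained_on.strftime("%Y-%m"),                  # year and month only
        "owner": owner,                                           # team or individual
    }


if __name__ == "__main__":
    tags = build_model_tags("TensorFlow", "2.12", "classification", date(2025, 1, 13), "data-science-team")
    for key, value in tags.items():
        print(f"{key}: {value}")
```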
Documentation:
- Attach model cards with:
- Model purpose and use cases
- Training data description
- Performance metrics (accuracy, latency)
- Known limitations
- Example usage
- Include test datasets as artifacts
- Provide preprocessing/postprocessing code