6.4 Best Practices


This page provides recommendations for optimal use of the Australian Dataspace Testbed Platform, covering security, performance, cost management, and development workflows.

Security Best Practices

IP Address Whitelisting

  • Always restrict security group access to specific IP addresses or CIDR ranges
  • Never allow 0.0.0.0/0 (any IP address) on any port other than 443
  • Update security groups when changing networks or locations
  • Use /32 CIDR notation for single IP addresses (e.g., 1.2.3.4/32)
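
As a rough illustration of the rules above, the following shell sketch turns a bare IPv4 address into /32 notation and rejects ranges that match any IP. The function name to_single_host_cidr is hypothetical and not part of the platform; it only prepares the value you would then enter in a security group rule, and it does not check that each octet is 255 or less.

```shell
# Hypothetical helper (not a platform command): normalise an address for a
# security group rule and reject ranges that match any IP.
to_single_host_cidr() {
  local addr="$1"
  case "$addr" in
    # A /0 prefix matches every address, so refuse it outright.
    */0) echo "refusing open range: $addr" >&2; return 1 ;;
    # Already in CIDR notation: pass it through unchanged.
    */*) printf '%s\n' "$addr"; return 0 ;;
  esac
  # Bare IPv4 address: require four dot-separated numbers, then append /32.
  if printf '%s' "$addr" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    printf '%s/32\n' "$addr"
  else
    echo "not an IPv4 address: $addr" >&2
    return 1
  fi
}
```

For example, to_single_host_cidr 1.2.3.4 prints 1.2.3.4/32, while to_single_host_cidr 0.0.0.0/0 prints an error and returns a non-zero status.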

SSH Key Management

  • Store private keys securely with appropriate file permissions (chmod 400)
  • Never share private keys or commit them to version control
  • Use separate key pairs for different projects or environments
  • For long-running projects, rotate keys periodically through the platform (delete and regenerate)
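
The permission advice above can be checked with a small sketch like the one below. The function name check_key_permissions is hypothetical. It accepts mode 400 as recommended here, and also 600, which OpenSSH accepts as well; adjust the accepted list if you want to enforce 400 only.

```shell
# Hypothetical helper: succeed only if a private key file is accessible by
# its owner alone (mode 400 or 600).
check_key_permissions() {
  local key_file="$1" mode
  # GNU stat (Linux) uses -c; BSD stat (macOS) uses -f.
  mode=$(stat -c '%a' "$key_file" 2>/dev/null \
    || stat -f '%Lp' "$key_file" 2>/dev/null) || return 2
  case "$mode" in
    400|600) return 0 ;;
    *) echo "$key_file has mode $mode; run: chmod 400 $key_file" >&2
       return 1 ;;
  esac
}
```

A typical call would be check_key_permissions ~/.ssh/my-dataspace-key.pem, where the file name is a placeholder for your own key.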

Certificate Handling

  • Accept self-signed certificate warnings only for your own dataspace instances
  • Verify the IP address matches your instance before accepting certificates
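
One way to inspect a certificate before accepting it in the browser is to fetch it from the command line. The sketch below assumes the openssl command-line tool is installed on your workstation; the function name show_server_certificate is hypothetical.

```shell
# Hypothetical helper: print the subject, validity dates and SHA-256
# fingerprint of the certificate served at HOST:PORT (default port 443).
show_server_certificate() {
  local host="$1" port="${2:-443}"
  # s_client fetches the certificate; x509 decodes it without trusting it.
  openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -dates -fingerprint -sha256
}
```

Call it with the IP address shown for your instance in the platform, then compare the printed fingerprint with the one in the browser's certificate warning before accepting.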

Credential Management

  • Store RDA API keys and other credentials securely
  • Contact RDA immediately if you believe your key may be compromised

Access Control

  • Sign in with your own institutional credentials and never share them
  • Log out when using shared or public computers

Performance Optimisation

Instance Sizing

  • Start with recommended minimum sizes:
    • AWS: m7i.xlarge (16GB RAM) or larger
    • Azure: Standard_D4s_v3 (16GB RAM) or larger
    • Nectar: m3.medium (16GB RAM) or larger
  • For large datasets, use memory-optimised instances:
    • AWS: r7i series
    • Azure: Standard_E series
    • Nectar: r3 series
  • Monitor resource usage and scale up if experiencing performance issues

Resource Monitoring

  • Regularly check CPU and memory usage via SSH:
    top
    htop
    free -h
    df -h
  • Monitor Docker container resource consumption:
    sudo docker stats
  • Watch for sustained memory pressure or high CPU usage, which indicates the need for a larger instance
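
For a quick numeric check of memory pressure, the following sketch computes the percentage of RAM in use from the MemTotal and MemAvailable fields of /proc/meminfo. The function name mem_used_percent is hypothetical, and any alert threshold you compare against (for example 90) is your own choice rather than a platform recommendation.

```shell
# Hypothetical helper: print the percentage of RAM in use, computed from a
# meminfo-format file (defaults to /proc/meminfo on Linux).
mem_used_percent() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total <= 0) exit 1
      printf "%d\n", (total - avail) * 100 / total
    }
  ' "${1:-/proc/meminfo}"
}
```

Run mem_used_percent with no arguments on the instance to read the live value.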

Docker Management

  • Restart containers if they become unresponsive:
    sudo docker compose restart
  • Monitor container logs for errors:
    sudo docker compose logs -f
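
When the logs are long, it can help to filter them for likely problems. The sketch below reads log text on standard input and keeps only lines containing a few common keywords. The function name filter_log_problems and the keyword list are assumptions; the services in your dataspace may log errors with different wording.

```shell
# Hypothetical helper: keep only log lines that look like problems.
# Reads log text on standard input; the keyword list is an assumption.
filter_log_problems() {
  # "|| true" keeps the exit status at zero when nothing matches.
  grep -Ei 'error|fatal|exception|panic' || true
}
```

For example: sudo docker compose logs --no-color --tail 500 | filter_log_problems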

Network Performance

  • Choose the cloud region closest to your location for lower latency
  • For large data transfers, consider using instance types with enhanced networking

Cost Management

Instance Lifecycle

  • Stop dataspaces when not actively in use (evenings, weekends)
  • Delete dataspaces that are no longer needed
  • Avoid leaving instances running overnight for development work
  • Schedule regular reviews of active dataspaces
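
To get a feel for how much stopping instances outside working hours can save, the sketch below compares a chosen schedule with running all 168 hours of the week. The function name weekly_hours_saved_percent is hypothetical, and it counts compute hours only; whether a stopped dataspace still consumes credits (for example for storage) is not covered here.

```shell
# Hypothetical helper: percentage of weekly compute hours avoided by running
# an instance only HOURS_PER_DAY hours on DAYS_PER_WEEK days.
weekly_hours_saved_percent() {
  local hours_per_day="$1" days_per_week="$2"
  local used=$(( hours_per_day * days_per_week ))
  local always_on=$(( 24 * 7 ))   # 168 hours in a week
  # Integer division, so the result is rounded down.
  echo $(( (always_on - used) * 100 / always_on ))
}
```

Running an instance 8 hours a day on 5 days uses 40 of 168 hours, so weekly_hours_saved_percent 8 5 prints 76.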

Right-Sizing

  • Don’t over-provision resources; start with minimum recommended sizes
  • Scale up only when performance issues are observed
  • Use smaller instances for development and testing
  • Reserve larger instances for large datasets or demonstrations

Credit Monitoring

  • Check credit usage regularly through the platform dashboard
  • Set reminders to review costs weekly
  • Plan resource usage to stay within project allocation

Cost-Effective Practices

  • Terminate failed or misconfigured instances promptly
  • Document each instance's purpose in the description field so that unneeded resources are easy to identify and remove

Troubleshooting Workflow

Systematic Approach

  1. Identify the specific component or service failing
  2. Check relevant logs for error messages
  3. Verify network connectivity and security groups
  4. Confirm services are running
  5. Test with minimal configuration
  6. Gradually add complexity

Log Locations

  • Bootstrap logs: /var/log/cloud-init-output.log
  • Docker logs: sudo docker compose logs [service-name]
  • Application logs: check inside the relevant Docker container

Before Seeking Support

  • Attempt basic troubleshooting steps
  • Gather relevant error messages and logs
  • Document steps to reproduce the issue
  • Note what troubleshooting has already been attempted
  • Check FAQ and known issues documentation
  • Check whether the issue can be reproduced on a different dataspace deployment
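
Gathering the logs listed above can be scripted. The following is a minimal sketch, assuming it is run on the instance from the directory that holds the Docker Compose file; the function name collect_diagnostics and the output file names are hypothetical. Each step tolerates failure so that a partial bundle is still produced.

```shell
# Hypothetical helper: collect basic diagnostics into a directory so they can
# be attached to a support request. Every step tolerates failure.
collect_diagnostics() {
  local out_dir="${1:-./support-bundle}" cmd
  mkdir -p "$out_dir" || return 1

  # System overview: time, kernel, memory and disk usage.
  for cmd in "date -u" "uname -a" "free -h" "df -h"; do
    echo "## $cmd"
    $cmd 2>&1 || echo "(failed: $cmd)"
  done > "$out_dir/system.txt"

  # Bootstrap log, if it exists and is readable.
  cp /var/log/cloud-init-output.log "$out_dir/" 2>/dev/null \
    || echo "bootstrap log not readable" > "$out_dir/cloud-init-output.missing"

  # Recent container logs. sudo -n fails instead of prompting for a password.
  sudo -n docker compose logs --no-color --tail 500 \
    > "$out_dir/docker-compose.log" 2>&1 || true

  echo "diagnostics written to $out_dir"
}
```

Review the collected files for API keys or other credentials before sharing them.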