docs: remove dollar signs to allow for correct copy and paste #1

Open: wants to merge 100 commits into base: main
100 commits
c48d12c
Upgrade Kustomize to version 4.5.4
astefanutti May 31, 2023
1616ded
Use Kustomize to bundle MCAD CRDs
astefanutti May 31, 2023
f228dba
add new github action which tags and builds a new operator
KPostOffice May 24, 2023
be42693
use unauthenticated red hat registry for Docker file
KPostOffice Jun 1, 2023
15df44b
feat(api): Support custom InstaScale container image
astefanutti Jun 1, 2023
59e97a4
Remove generated OLM bundle Dockerfile
astefanutti Jun 7, 2023
08fa9ed
Use uniquely identifying labels selector for the operator deployment
astefanutti Jun 1, 2023
a71303b
Label resources following recommended k8s labels convention
astefanutti Jun 1, 2023
bb1ce80
Add test for custom InstaScale controller image support
astefanutti Jun 7, 2023
baa7738
Update operator to 0.0.4
anishasthana Jun 8, 2023
b3bdf17
Add astefanutti to OWNERS
anishasthana Jun 8, 2023
094529c
Fix failing InstaScale tests
anishasthana Jun 8, 2023
7952081
Add make directive for opening PR on OpenShift community operators
KPostOffice May 31, 2023
d3cd099
Update version in makefile to 0.0.4
anishasthana Jun 9, 2023
95a9935
add a sign to commit so that it passes git checkrun
KPostOffice Jun 9, 2023
d4c9ecd
Update GitHub actions to avoid using deprecated actions
sutaakar Jun 6, 2023
071f041
change the default versions to be clearly placeholders
KPostOffice Jun 9, 2023
c1b5fa2
update contributing guidelines with build info
KPostOffice Jun 9, 2023
cf16a07
remove patches which are generated during bundle build
KPostOffice Jun 9, 2023
360e01f
restore config directory after building bundle
KPostOffice Jun 9, 2023
3180604
Generate CodeFlare API client
astefanutti Jun 12, 2023
02e14b5
Add MCAD version variable
astefanutti Jun 14, 2023
73db60f
Extended support for referencing MCAD CRD resources
astefanutti Jun 16, 2023
5e76774
updating rbacs for instascale
Fiona-Waters Jun 15, 2023
c2b68fa
addressing feedback, specifying required secret
Fiona-Waters Jun 15, 2023
892d65d
adding clusterversions
Fiona-Waters Jun 19, 2023
29bd18e
updating test data
Fiona-Waters Jun 19, 2023
9a7746b
Update README.md with release steps
sutaakar Jun 1, 2023
cd0cd54
feat(api): Support custom MCAD container image
astefanutti Jun 23, 2023
504a1e9
Create GitHub action to automate CodeFlare operator release
sutaakar Jun 13, 2023
8e0c5a0
refactor: add liveness and readyness probes to instascale deployment …
dimakis Jun 27, 2023
6531fc5
perf: addition of liveness and readiness probes to mcad deployment te…
dimakis Jun 29, 2023
88cd476
fix: fixes the path for readiness probe
dimakis Jun 29, 2023
f7a3226
Initial e2e tests
astefanutti Jun 13, 2023
915d420
test: Submit MNIST RayJob
astefanutti Jun 19, 2023
aea8000
test: Fix RayCluster labels selector
astefanutti Jun 19, 2023
059af32
e2e: Print KubeRay operator logs
astefanutti Jun 19, 2023
fb4b7b4
test: Fix RayJob runtime environment
astefanutti Jun 19, 2023
922dd7a
test: Print RayJob logs
astefanutti Jun 20, 2023
d4a6cfe
test: Document how to run e2e tests locally
astefanutti Jun 21, 2023
ff609f9
test: Polish MNIST RayJob test
astefanutti Jun 21, 2023
2361ef5
test: Add MNIST training with MCAD Job
astefanutti Jun 22, 2023
1285e8e
test: Print MNIST batch job logs
astefanutti Jun 22, 2023
6553bfd
test: Use RayCluster 'complete' configuration
astefanutti Jun 22, 2023
0c4e2ae
test: Add step log statements
astefanutti Jun 22, 2023
ae870ea
test: Add defered troubleshooting logs
astefanutti Jun 22, 2023
4fb8a6f
test: Add MNIST training in RayCluster with CodeFlare SDK
astefanutti Jun 22, 2023
3c21143
test: Customize test timeouts
astefanutti Jun 22, 2023
fc86a75
test: Pass MNIST training with CodeFlare SDK on OpenShift
astefanutti Jun 26, 2023
f55e8c9
test: Print Job logs after successfull or failed completion
astefanutti Jun 27, 2023
24841ca
test: Re-use pip requirements file
astefanutti Jun 27, 2023
475137e
test: Parameterize CodeFlare SDK version
astefanutti Jun 27, 2023
f0960c1
test: Remove ray_lightning from requirements
astefanutti Jun 27, 2023
3fd6af6
test: Parameterize Ray image and version
astefanutti Jun 27, 2023
2d477f6
test: Parameterize PyTorch image
astefanutti Jun 27, 2023
57b427d
test: Add FIXME for SDK user base image
astefanutti Jun 27, 2023
9c79ab0
Align go.mod with MCAD version
astefanutti Jun 27, 2023
f7b7a3a
test: Print Ray job logs after successful or failed completion
astefanutti Jun 28, 2023
321a8aa
test: Upload job logs
astefanutti Jun 29, 2023
630199b
test Remove unused functions
astefanutti Jun 29, 2023
9771055
test: Fix Unexpected kind-action input
astefanutti Jun 30, 2023
06d47ed
test: Format test output using gotestfmt
astefanutti Jun 30, 2023
d665dcc
test: Add codeflare stack logs to uploaded artifacts
astefanutti Jun 30, 2023
9d53656
test: Add description to e2e tests
astefanutti Jun 30, 2023
99b8e7c
test: Factorize e2e tests setup
astefanutti Jun 30, 2023
4048a26
test: Update e2e tests local run documentation
astefanutti Jun 30, 2023
8c1831a
test: Write logs also for jobs that have timed out
astefanutti Jul 5, 2023
2fa3f90
doc: The operator should be started before setting up e2e tests
astefanutti Jul 5, 2023
e04fbfa
fix: Update MCAD version when installing CRDs
astefanutti Jun 27, 2023
92c616d
Use CodeFlare machine user to push PR into OpenShift community operat…
sutaakar Jul 7, 2023
644d27b
Update dependency versions for release v0.0.5
anishasthana Jul 7, 2023
598f32b
Fix mcad tag reference for CRDs
anishasthana Jul 9, 2023
cd43ff5
Adjust Makefile versions before building operator image
sutaakar Jul 10, 2023
1380bcc
updating rbacs
Fiona-Waters Jul 10, 2023
d376f12
Update dependency versions for release v0.0.6
anishasthana Jul 10, 2023
70f0719
Run go mod tidy before comitting files in release workflow
sutaakar Jul 11, 2023
9c269d0
OLM install and upgrade PR check for CodeFlare stack
sutaakar Jun 15, 2023
0a84b1d
Expose CodeFlare Operator metrics endpoint
ChristianZaccaria Jul 18, 2023
1ff1120
Add Ray cluster support in test support package
sutaakar Jul 24, 2023
c885e0a
Update: remove duplicated resource and should have lower in order bei…
zdtsw Jul 23, 2023
c9929bd
Remove waiting loops for kubernetes resources from Makefile
sutaakar Jul 25, 2023
a822747
Add Ray cluster REST API support in test support package
sutaakar Jul 27, 2023
3f5e55d
Add Make Target for organizing go imports
anishasthana Jul 27, 2023
9b5edf8
Organize go imports
anishasthana Jul 27, 2023
234cb41
Add verify-imports github action
anishasthana Jul 27, 2023
223dffe
Remove revive check
anishasthana Jul 28, 2023
ef43fee
Organize go imports for generated files
anishasthana Jul 28, 2023
817f97e
Release workflow - open a PR with changes instead of direct push
sutaakar Jul 28, 2023
9eeacae
Fix: add missing --output-base option to applyconfiguration-gen
astefanutti Jul 31, 2023
47bc1ff
client: Add missing ControllerImage field to apply configuration
astefanutti Jul 31, 2023
ca3baf6
Create GitHub action to automate CodeFlare project release
sutaakar Jun 27, 2023
c2d2f88
Remove kube-rbac-proxy and open metrics endpoint
ChristianZaccaria Jul 24, 2023
8bf0aab
Create consolidated action for verification of auto-generated / forma…
anishasthana Jul 31, 2023
226a560
Bump github.com/onsi/ginkgo/v2 from 2.9.2 to 2.11.0
dependabot[bot] Aug 3, 2023
8a9d684
Upgrade MCAD to version 1.33.0
astefanutti Aug 8, 2023
cd8b115
Update dependency versions for release v0.1.0
codeflare-machine-account Aug 8, 2023
81e8f81
e2e: Use Ray dashboard API to retrieve job logs
astefanutti Aug 3, 2023
bc4b46a
e2e: Add kind-e2e target to setup test KinD cluster
astefanutti Aug 3, 2023
d987823
Small Make target bug-fix
ChristianZaccaria Aug 9, 2023
28726c3
docs: remove dollar signs to allow for correct copy and paste
dimakis Aug 9, 2023
58 changes: 58 additions & 0 deletions .github/actions/kind/action.yml
@@ -0,0 +1,58 @@
name: "Set up KinD"
description: "Step to start and configure KinD cluster"

runs:
  using: "composite"
  steps:
    - name: Init directories
      shell: bash
      run: |
        TEMP_DIR="$(pwd)/tmp"
        mkdir -p "${TEMP_DIR}"
        echo "TEMP_DIR=${TEMP_DIR}" >> $GITHUB_ENV

        mkdir -p "$(pwd)/bin"
        echo "$(pwd)/bin" >> $GITHUB_PATH

    - name: Container image registry
      shell: bash
      run: |
        podman run -d -p 5000:5000 --name registry registry:2.8.1

        export REGISTRY_ADDRESS=$(hostname -i):5000
        echo "REGISTRY_ADDRESS=${REGISTRY_ADDRESS}" >> $GITHUB_ENV
        echo "Container image registry started at ${REGISTRY_ADDRESS}"

        KIND_CONFIG_FILE=${{ env.TEMP_DIR }}/kind.yaml
        echo "KIND_CONFIG_FILE=${KIND_CONFIG_FILE}" >> $GITHUB_ENV
        envsubst < .github/resources-kind/kind.yaml > ${KIND_CONFIG_FILE}

        sudo --preserve-env=REGISTRY_ADDRESS sh -c 'cat > /etc/containers/registries.conf.d/local.conf <<EOF
        [[registry]]
        prefix = "$REGISTRY_ADDRESS"
        insecure = true
        location = "$REGISTRY_ADDRESS"
        EOF'

    - name: Setup KinD cluster
      uses: helm/[email protected]
      with:
        cluster_name: cluster
        version: v0.17.0
        config: ${{ env.KIND_CONFIG_FILE }}

    - name: Print cluster info
      shell: bash
      run: |
        echo "KinD cluster:"
        kubectl cluster-info
        kubectl describe nodes

    - name: Install Ingress controller
      shell: bash
      run: |
        VERSION=controller-v1.6.4
        echo "Deploying Ingress controller into KinD cluster"
        curl https://raw.githubusercontent.com/kubernetes/ingress-nginx/"${VERSION}"/deploy/static/provider/kind/deploy.yaml | sed "s/--publish-status-address=localhost/--report-node-internal-ip-address\\n - --status-update-interval=10/g" | kubectl apply -f -
        kubectl annotate ingressclass nginx "ingressclass.kubernetes.io/is-default-class=true"
        kubectl -n ingress-nginx wait --timeout=300s --for=condition=Available deployments --all
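
Note on the `>> $GITHUB_ENV` lines in this action: each step runs in a fresh shell, so values such as `TEMP_DIR` and `REGISTRY_ADDRESS` only reach later steps because they are appended to the file named by `$GITHUB_ENV`. The sketch below simulates that hand-off locally. It is an illustration, not the runner's implementation: the temp file standing in for `$GITHUB_ENV` and the `load_env_file` helper are assumptions, and only simple `KEY=VALUE` lines are handled.

```shell
#!/bin/sh
# Stand-in for the file the runner provides through $GITHUB_ENV
# (assumption: a temp file is enough to illustrate the mechanism).
GITHUB_ENV="$(mktemp)"

# Step 1: persist a value for later steps, as "Init directories" does.
TEMP_DIR="$(pwd)/tmp"
echo "TEMP_DIR=${TEMP_DIR}" >> "$GITHUB_ENV"
unset TEMP_DIR   # a new step starts in a fresh shell, so the variable is gone

# Hypothetical helper: export every non-empty KEY=VALUE line, roughly what
# the runner does before it starts the next step.
load_env_file() {
  while IFS= read -r line; do
    if [ -n "$line" ]; then
      export "$line"
    fi
  done < "$1"
}

# Step 2: after loading the file, the value is visible again.
load_env_file "$GITHUB_ENV"
echo "TEMP_DIR is ${TEMP_DIR}"
rm -f "$GITHUB_ENV"
```

This is also why the action reads `${{ env.TEMP_DIR }}` in the registry step rather than relying on the shell variable set in the first step.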
31 changes: 31 additions & 0 deletions .github/resources-kind/kind.yaml
@@ -0,0 +1,31 @@
# ---------------------------------------------------------------------------
# Copyright 2023.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ---------------------------------------------------------------------------

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    image: kindest/node:v1.25.3@sha256:f52781bc0d7a19fb6c405c2af83abfeb311f130707a0e219175677e366cc45d1
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "ingress-ready=true"
containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."${REGISTRY_ADDRESS}"]
      endpoint = ["http://${REGISTRY_ADDRESS}"]
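
Note on the `${REGISTRY_ADDRESS}` placeholders in this file: they are not interpreted by KinD. The composite action renders the file with `envsubst` before passing it to the cluster setup step. A minimal sketch of that rendering follows. The `render_kind_config` function is a hypothetical stand-in for `envsubst` that only handles this one placeholder, and the address is an example value.

```shell
#!/bin/sh
# Example value; in the workflow this comes from `$(hostname -i):5000`.
REGISTRY_ADDRESS="10.0.0.5:5000"

# Hypothetical stand-in for `envsubst`, limited to the single
# ${REGISTRY_ADDRESS} placeholder used by kind.yaml.
render_kind_config() {
  sed "s|[$]{REGISTRY_ADDRESS}|${REGISTRY_ADDRESS}|g"
}

# Single quotes keep the shell from expanding the placeholder in the template.
template='containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."${REGISTRY_ADDRESS}"]
      endpoint = ["http://${REGISTRY_ADDRESS}"]'

rendered="$(printf '%s\n' "$template" | render_kind_config)"
printf '%s\n' "$rendered"
```

The rendered output points the containerd mirror at the plain-HTTP local registry, which matches the `insecure = true` entry the action writes for podman.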
12 changes: 12 additions & 0 deletions .github/resources-olm-upgrade/catalogsource.yaml
@@ -0,0 +1,12 @@
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: codeflare-olm-test
  namespace: olm
spec:
  displayName: ''
  grpcPodConfig:
    securityContextConfig: restricted
  image: "${CATALOG_BASE_IMG}"
  publisher: ''
  sourceType: grpc
5 changes: 5 additions & 0 deletions .github/resources-olm-upgrade/operatorgroup.yaml
@@ -0,0 +1,5 @@
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-operators
  namespace: openshift-operators
11 changes: 11 additions & 0 deletions .github/resources-olm-upgrade/subscription.yaml
@@ -0,0 +1,11 @@
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: codeflare-operator
  namespace: openshift-operators
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: codeflare-operator
  source: codeflare-olm-test
  sourceNamespace: olm
119 changes: 119 additions & 0 deletions .github/workflows/e2e_tests.yaml
@@ -0,0 +1,119 @@
name: e2e

on:
  pull_request:
    branches:
      - main
      - 'release-*'
    paths-ignore:
      - 'docs/**'
      - '**.adoc'
      - '**.md'
      - 'LICENSE'
  push:
    branches:
      - main
      - 'release-*'
    paths-ignore:
      - 'docs/**'
      - '**.adoc'
      - '**.md'
      - 'LICENSE'

concurrency:
  group: ${{ github.head_ref }}-${{ github.workflow }}
  cancel-in-progress: true

jobs:
  kubernetes:

    runs-on: ubuntu-20.04

    steps:
      - name: Cleanup
        run: |
          ls -lart
          echo "Initial status:"
          df -h

          echo "Cleaning up resources:"
          sudo swapoff -a
          sudo rm -f /swapfile
          sudo apt clean
          sudo rm -rf /usr/share/dotnet
          sudo rm -rf /opt/ghc
          sudo rm -rf "/usr/local/share/boost"
          sudo rm -rf "$AGENT_TOOLSDIRECTORY"
          docker rmi $(docker image ls -aq)

          echo "Final status:"
          df -h

      - name: Checkout code
        uses: actions/checkout@v3
        with:
          submodules: recursive

      - name: Set Go
        uses: actions/setup-go@v3
        with:
          go-version: v1.18

      - name: Set up gotestfmt
        uses: gotesttools/gotestfmt-action@v2
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Setup and start KinD cluster
        uses: ./.github/actions/kind

      - name: Deploy CodeFlare stack
        id: deploy
        run: |
          echo Deploying CodeFlare operator
          IMG="${REGISTRY_ADDRESS}"/codeflare-operator
          make image-push -e IMG="${IMG}"
          make deploy -e IMG="${IMG}"
          kubectl wait --timeout=120s --for=condition=Available=true deployment -n openshift-operators codeflare-operator-manager

          echo Setting up CodeFlare stack
          make setup-e2e

      - name: Run e2e tests
        run: |
          export CODEFLARE_TEST_TIMEOUT_SHORT=1m
          export CODEFLARE_TEST_TIMEOUT_MEDIUM=5m
          export CODEFLARE_TEST_TIMEOUT_LONG=10m

          export CODEFLARE_TEST_OUTPUT_DIR=${{ env.TEMP_DIR }}
          echo "CODEFLARE_TEST_OUTPUT_DIR=${CODEFLARE_TEST_OUTPUT_DIR}" >> $GITHUB_ENV

          set -euo pipefail
          go test -timeout 30m -v ./test/e2e -json 2>&1 | tee ${CODEFLARE_TEST_OUTPUT_DIR}/gotest.log | gotestfmt

      - name: Print CodeFlare operator logs
        if: always() && steps.deploy.outcome == 'success'
        run: |
          echo "Printing CodeFlare operator logs"
          kubectl logs -n openshift-operators --tail -1 -l app.kubernetes.io/name=codeflare-operator | tee ${CODEFLARE_TEST_OUTPUT_DIR}/codeflare-operator.log

      - name: Print MCAD controller logs
        if: always() && steps.deploy.outcome == 'success'
        run: |
          echo "Printing MCAD controller logs"
          kubectl logs -n codeflare-system --tail -1 -l component=multi-cluster-application-dispatcher | tee ${CODEFLARE_TEST_OUTPUT_DIR}/mcad.log

      - name: Print KubeRay operator logs
        if: always() && steps.deploy.outcome == 'success'
        run: |
          echo "Printing KubeRay operator logs"
          kubectl logs -n ray-system --tail -1 -l app.kubernetes.io/name=kuberay | tee ${CODEFLARE_TEST_OUTPUT_DIR}/kuberay.log

      - name: Upload logs
        uses: actions/upload-artifact@v3
        if: always() && steps.deploy.outcome == 'success'
        with:
          name: logs
          retention-days: 10
          path: |
            ${{ env.CODEFLARE_TEST_OUTPUT_DIR }}/**/*.log
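
Note on `set -euo pipefail` in the "Run e2e tests" step: the test command is piped through `tee` and `gotestfmt`, and by default a pipeline reports only the exit status of its last command. Without `pipefail`, a failing `go test` could therefore be masked by the commands after it in the pipe. The sketch below shows the difference, using `false | cat` as a stand-in for the real pipeline (an assumption made so it runs without Go or a cluster).

```shell
#!/bin/sh
# Without pipefail, the pipeline's status is that of the last command (cat),
# so the failure of `false` is hidden.
status_default=0
bash -c 'false | cat' || status_default=$?

# With pipefail, a failure anywhere in the pipeline is reported.
status_pipefail=0
bash -c 'set -o pipefail; false | cat' || status_pipefail=$?

echo "default=${status_default} pipefail=${status_pipefail}"
# prints: default=0 pipefail=1
```

With the option enabled, a non-zero exit from `go test` fails the step even though `tee` and `gotestfmt` themselves exit cleanly.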