AI/LLM Exploitation Challenges

lab Challenge series

Challenges designed to test your knowledge of AI, ML, and LLMs, along with practical exploitation techniques.

Write your awesome label here.

Explore the Challenges

Welcome to the Ultimate Essay Grader. This is an intelligent essay grading application powered by AI that provides instant feedback and comprehensive evaluation for student essays. It supports Automated Grading where you get instant AI-powered grades from A-F and numerical scores from 0-100, for multiple essay types. It supports for Argumentative Essays, Narrative Essays, Research Papers, and Literary Analysis, and gives comprehensive feedback including strengths, areas for improvement, and grammar assessment. It allows uploading essays in .docx or .doc format.

You are a RedTeam operator hired to perform a security audit on the system. You've discovered a secure vault system that uses an ML model with to authenticate users. The model analyzes uploaded images of handwritten digits and only grants access when it predicts the digit as "9" using large amount of MNIST dataset. Use the model.h5 for reference. Your goal is to identify the vulnerability, and craft an adversarial example that fools the model to bypass authentication and unlock the vault to retrieve the flag. Share the script and image that works.

You are a Red Team Operator hired to demonstrate why this is a catastrophic idea. You have identified that you can run python script in the application folder that can update the production CSV file. You cannot modify the Python code running the server, but you can write to the CSV file. Your goal is to plant a "Backdoor" in the AI model.

You have to provide a python script to Poison the data so that a clearly malicious email is marked as SAFE if and only if it contains the secret trigger word SystemOverrideAlpha.

Sarah, the AI Security Lead at TechVision Corp, has assigned you the role of AI Security Analyst to assess the security of machine learning models supplied by an external vendor for an autonomous vehicle system.

You have been provided with five pre-trained CIFAR-10 image classification models. Although the vendor claims all models meet accuracy requirements, recent intelligence suggests their infrastructure may have been compromised. Sarah suspects that one of the models contains a neural network backdoor, a hidden trigger that causes targeted misclassification when a specific visual pattern is present in an input image.

Maya, the Chief Security Officer at DataFlow Industries, has assigned you the role of ML Security Analyst to conduct a critical supply chain security audit. The company's AI team recently downloaded five pre-trained models from a popular community repository for deployment in a production fraud detection system.

You have been provided with five models: community_classifier, advanced_classifier, bert_finetuned, safe_classifier, and high_accuracy. Although the vendor claims all models are safe and meet performance standards, recent threat intelligence indicates that the community repository may have been compromised by a sophisticated supply chain attack.

MCP Skill Marketplace is a growing ecosystem of AI extensions that help developers automate tasks, manage code, and boost productivity. The marketplace hosts hundreds of third-party skills including code formatters, git assistants, and documentation generators. Developers say these tools have the best security possible, but that's what everyone says right?

You've been given three popular MCP skills to evaluate before your team installs them:

1. Code Formatter Pro - Formats Python and JavaScript code

2. Git Workflow Assistant - Automates git operations and commit messages

3. API Documentation Generator - Creates OpenAPI/Swagger documentation

A healthcare company called MedSecure AI trained a diagnosis model on 500 real patient records. To comply with privacy regulations (HIPAA/GDPR), they "deleted" the training data afterward. The problem: The model memorized patient information through overfitting. The data isn't truly deleted - it's encoded in the model weights!

This vulnerability leads to serious privacy violations:

GDPR "Right to be Forgotten" - Data wasn't actually deleted

HIPAA breach - Can identify whose medical records were used

Consent violations - Prove specific individuals' data was used without permission

After You Upload Your Solution:

Review

We’ll review your submission to confirm correct exploitation. This may take up to 5 business days

Certification

Successfully completing the challenges earns you a verified digital certificate to showcase your skills

03

Recognition

Add your certificate to your LinkedIn profile and portfolio, validating your hands-on skills in AI and LLM exploitation

Hands-On Exploitation Skills
Practice prompt injection, access control issues, reverse engineering, static and dynamic analysis, and bypassing security controls that apply to real AI applications.

Real-World Scenarios
Work with multiple binaries that have vulnerabilities ranging from prompt injection, model evasion, to Adversarial perturbations issues that simulate the challenges you'd face in real world cases for AI and LLM exploitation.

Tool Proficiency
Get comfortable using tools like MCP inspector, Cursor, Adversarial Robustness Toolbox, and more in practical settings.

Security Mindset
Train yourself to think like an attacker: identify weaknesses, understand threat models, and build intuition around LLM system attack strategies and defense evasion.

Portfolio-Ready Experience
Build a strong foundation that you can showcase, whether you are applying for security roles or contributing to modern Artificial Intelligence and Model development environments.

Are you ready to test your AI and LLM Security skills?

At 8kSec, we Empower businesses with Cutting-Edge Mobile Security Training, Pioneering Vulnerability research, and Innovative product offerings

Senior Lead Security Engineer - Mobile*

A Global Banking Corporation

$147,700 - $190,000 a year

- Formal training or certification on Mobile Development concepts and 5+ years applied experience
- Strong understanding of mobile application security risks and mitigation strategies for both Android and iOS platforms
- Experience in implementing or managing mobile security operations
- Familiarity with CI/CD pipelines, DevSecOps methodologies, and secure software development practices
- Ability to collaborate with development teams on security functions & resolutions
- Hands-on practical experience delivering enterprise level cybersecurity solutions and controls
- Strong collaboration and communication skills are essential for working effectively with teams on security implementations
- Ability to evaluate current and emerging technologies to select or recommend the best solutions for future state architecture & enterprise integrations
- Proven experience leading projects from scoping to delivery

- Working closely with in-house mobile development teams, providing guidance on secure coding practices, threat mitigation strategies, and optimal use of mobile security solutions.
- Utilize our mobile security vendors and tools to drive proactive security measures, ensuring optimal configuration, monitoring, and maintenance to safeguard our mobile applications.
- Oversee the deployment, integration, and ongoing support of mobile security tools, ensuring they are effectively utilized and updated.
- Provide technical leadership in securing mobile applications and infrastructure, ensuring compliance with industry standards and best practices.
- Manage the lifecycle of mobile security tools, including planning and executing upgrades to maintain optimal performance and security.
- Work closely with cross-functional teams to enhance security awareness, provide training, and ensure adherence to security protocols. Additionally, serve as a key feedback conduit to the mobile binary scanning team, risk management, and source scanning teams, ensuring continuous improvements in security posture and alignment with organizational security strategies.

Mobile Implant Software Engineer*

A Cyber-Risk Consulting Firm

$114,000 - $180,000 a year

- Proven experience in CNO and software development, particularly in support of DoD and Intelligence community customers
- Demonstrated ability to perform advanced research and development on embedded systems, Linux, and iOS platforms
- Strong understanding of network protocols and experience in implementing support for TCP, UDP, and TLS
- Experience in designing, developing, and integrating modular cyber capabilities
- Proficiency in using and integrating CI/CD tools and practices
- Excellent problem-solving skills and the ability to design novel solutions to complex security challenges
- Strong leadership skills with the ability to guide and mentor development teams
- Programming Languages: C, C++, Python, Java, x86 Assembly, MIPS Assembly, Microblaze Assembly, ARM Assembly, ARM64 Assembly, VHDL, Verilog, XML, JSON, HTML
- Tools and Technologies: LLDB/LLVM, IDA Pro, Immunity Debugger, Immunity Canvas, Eclipse, Git, Subversion, Embedded Systems, FPGAs, Docker, Intel Performance Primitives (IPP), High Performance Computing (HPC), REDHAWK, OmniORB CORBA, Software Defined Radios (SDR), Signal Processing, MySQL, PostgreSQL, JDBC, Django, ActiveMQ, Jpype, Pyxb, STOMP

- Lead research and development projects focused on mobile vulnerabilities and cyber operations
- Design and implement innovative solutions to address operational security challenges
- Architect and develop flexible, modular cyber capabilities in C, C++, and Python
- Triage and analyze public software vulnerabilities (CVEs) for security concerns
- Provide technical support and custom solutions to high-priority customer needs
- Design and develop new client/server data distribution tools
- Implement support for multiple network protocols, including TCP, UDP, and TLS
- Create custom build systems and ensure portability using Docker
- Integrate new projects with CI/CD services to streamline development processes
- Generate and maintain unit tests to enhance the reliability of client/server applications
- Guide the development team in adhering to industry software engineering standards and best practices

Sr. Android Penetration Tester*

A Gobal Device Company

$130,000 - $166,000 a year

- 5+ years of penetration testing experience, including 2+ years of Android penetration testing and 1+ year of web application penetration testing
- Strong understanding of malware, phishing attacks, attack vectors, and security best practices
- Knowledge of penetration testing tools, threat modeling, and security frameworks
- Ability to conduct security research, CVE analysis, and adversary simulation
- Strong communication skills to work cross-functionally with engineering and security teams
- Experience working in corporate environments with internal penetration testing teams (preferred over agency-based consulting experience)
- Bachelor’s degree in either Cybersecurity, Computer Science, Information Security, or related field
Preferred Qualifications:
- Certifications in offensive security
- Published CVEs, blog posts, or walkthroughs on security research
- Malware development and reverse engineering experience
- Experience working in top security consulting firms or in-house red teams at major tech companies
- Hands-on experience with firmware penetration testing and IoT security.

- Develop expertise in our product solutions, deep diving into design/architecture, & execute white box and black box penetration scenarios
- Plan, scope and conduct vulnerability assessment/ Penetration test on internal / external facing public assets such as Web application, Android platform, Android Apps, Backend APIs, and Cloud services
- Research & and conduct adversary simulation for known security threats and identify novel attack vectors to test a system’s relative security readiness
- Conduct Threat modelling, Threat Intelligence and scoping with stakeholders
- Assist in creating and maintaining internal penetration testing and practice within QA team, managing vulnerabilities and tracking until closure
- Build Test harness & required Automation suites and validate attack vectors in Threat Lab
- Co-ordinate with program management, security architects at Internal & offshore sites
- Stays up to date on current tools, technologies, and vulnerabilities to incorporate into testing practices
- Research and developing exploits for zero-day vulnerabilities
- Conduct penetration test on IOT and Firmware Devices

Explore the Challenges

What’s Inside?

Challenge: Ultimate AI Essay Grader

Objective

Challenge: Smart Workspace Assistant

Objective

Challenge: Secure Vault

Objective

Challenge: Password Bot

Objective

Challenge: Meetings MCP

Objective

Challenge: Intern Filter

Objective

Challenge: Neural Hunter

Objective

Challenge: Supply Chain Sabotage

Objective

Challenge: Rogue Skills

Objective

Challenge: Identity Crisis

Objective

After You Upload Your Solution:

Review

Certification

Recognition

Outcomes & Takeaways

Are you ready to test your AI and LLM Security skills?

FEATURED LINKS

POLICIES

CONNECT WITH US