Web Application Penetration Testing Methodology: 2026 Guide

By one widely cited estimate, a cyberattack is attempted against an internet-connected system every 39 seconds.

According to Verizon’s 2025 Data Breach Investigations Report, over 84% of all confirmed data breaches in 2025 involved the web application layer – a figure that has increased for five consecutive years.

In Australia, the consequences are tangible. The Medibank breach exposed 9.7 million customer records through a compromised web-facing system. The Optus breach affected 9.8 million Australians via an unsecured API. Both were web application layer failures – and both were the type of vulnerability that a proper penetration test is designed to find before an attacker does.

Automated scanners and WAFs (Web Application Firewalls) are not enough. They find known patterns. They miss business logic flaws. They generate false positives. They cannot chain low-severity findings into a high-impact attack path. Only a skilled human tester following a structured methodology can do that.

This guide explains the complete web application penetration testing methodology – from scoping through to report review – including every phase, the tools used, the vulnerabilities being tested, how black-box, grey-box, and white-box approaches differ, and what Australian businesses need to know about compliance, frequency, and how to get real value from a pentest engagement.

At CodeHyper, our penetration testing services follow the OWASP Web Security Testing Guide (WSTG) and align to the ACSC Essential Eight and Australian compliance requirements.

What Is Web Application Penetration Testing?

Web application penetration testing is a controlled, authorised security assessment in which a skilled tester simulates real-world attacks against your web application to identify exploitable vulnerabilities before malicious attackers do.

It is fundamentally different from an automated vulnerability scan.

A vulnerability scanner runs a known list of patterns against your application and reports matches. It is fast, broad, and produces results that often include significant false positives.

A penetration test goes further. A human tester:

Understands the application’s intended logic and finds ways to abuse it
Chains low-severity findings into high-impact attack paths
Explores authenticated functionality that scanners never reach
Tests for business logic flaws that have no pattern to match
Confirms every finding is genuinely exploitable – not a theoretical match

The output is a short list of confirmed, exploitable vulnerabilities – each with proof-of-concept evidence, business impact rating, and developer-ready remediation guidance.

The Frameworks Behind a Proper Web App Pentest

A credible methodology does not start from scratch. It follows established, peer-reviewed frameworks that define what must be tested and how.

OWASP Web Security Testing Guide (WSTG) v4.2 is the definitive technical standard for web application security testing. It is an open-source, community-maintained guide covering over 90 specific test cases across all areas of a web application – information gathering, authentication, session management, input validation, access control, cryptography, business logic, and API security. Every professional web app pentest should be aligned to the WSTG.

OWASP Top 10:2025 is the updated ranking of the ten most critical web application security risks, based on data collected from 2.8 million applications. The 2025 edition reflects current threat patterns, and coverage of the full OWASP Top 10 is the baseline expectation for any web application pentest.

PTES (Penetration Testing Execution Standard) defines the full engagement lifecycle – from pre-engagement through reporting. It provides the project management framework that the WSTG sits inside.

NIST SP 800-115 is the US federal technical guide for information security testing. It is referenced by Australian compliance frameworks and provides a complementary view of the testing lifecycle.

CVSS 3.1 (Common Vulnerability Scoring System) is the standardised severity scoring system. Every confirmed vulnerability in a professional pentest report should carry a CVSS 3.1 score – this is what compliance auditors expect and what development teams use to prioritise remediation.

Black-Box vs Grey-Box vs White-Box Testing

Before a pentest begins, the testing model must be agreed. The three models represent different levels of tester knowledge going in – and each produces different results.

Black-Box Testing

The tester has no prior knowledge of the application. They receive a URL and nothing else – simulating an external attacker with no inside information.

What it tests well: Real-world external attacker exposure. Publicly visible attack surface. How much damage a completely external threat can do.

What it misses: Deep application logic flaws that only appear after authentication. Vulnerabilities in internal functionality that a real attacker would likely discover after initial access.

Best for: Assessing external-facing exposure where the client specifically wants to understand the “outside view.”

Grey-Box Testing

The tester receives credentials for each relevant user role – standard user, administrator, API consumer, business user. They understand the application structure but do not have access to source code.

What it tests well: Authenticated functionality. Privilege escalation between roles. Insecure Direct Object References (IDOR). Business logic within the authenticated application. The vast majority of exploitable vulnerabilities in real applications.

Best for: Most commercial web application assessments. The most common engagement model because it balances realistic simulation with comprehensive coverage.

White-Box Testing

The tester has full access – credentials for all roles, source code, architecture documentation, and environment details.

What it tests well: Deepest possible coverage. Logic flaws visible only in the code. Race conditions. Complex multi-step workflow abuse. Developer-introduced vulnerabilities in non-obvious code paths.

Best for: Critical applications, pre-launch assessments for high-risk systems, financial applications, and healthcare platforms where maximum coverage is required.

The practical recommendation for most Australian businesses: Grey-box is the right default. It produces the most actionable findings for the cost. Reserve white-box for your highest-risk applications – payment systems, customer portals handling sensitive personal data, or applications where a breach would have regulatory consequences.

The Web Application Penetration Testing Methodology: Phase by Phase

A structured engagement follows six phases. Here is exactly what happens in each one.

Phase 1: Scoping and Rules of Engagement

Before testing begins, both parties must agree precisely on what is being tested and under what conditions.

What is defined in scoping:

Target URLs, domains, subdomains, and IP addresses in scope
Any URLs, functions, or environments explicitly out of scope
Testing window (working hours only, or 24/7 for production systems)
Authentication credentials provided (and at what role levels)
Whether automated scanning tools may be used and at what rate
Contact details for both the testing team and the client’s technical team
Incident response procedure if the test causes unintended disruption
Data handling agreement for any sensitive data encountered during testing

A well-scoped engagement prevents the two most common pentest problems: scope creep (the test takes longer than budgeted because the scope was undefined) and accidental production impact (testing disrupts live services because boundaries were not clear).
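The agreed boundaries can also be encoded so that tooling refuses to touch anything outside them. The following is a minimal Python sketch, assuming a hypothetical scope of two hosts with one excluded path; the hostnames, the `/admin/billing` path, and the `in_scope` helper are illustrative placeholders, not part of any real engagement.

```python
from urllib.parse import urlsplit

# Hypothetical scope definition; hosts and paths are placeholders.
IN_SCOPE_HOSTS = {"app.example.com", "api.example.com"}
OUT_OF_SCOPE_PREFIXES = ("/admin/billing",)


def in_scope(url: str) -> bool:
    """Return True only if the host is in scope and the path is not excluded."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in IN_SCOPE_HOSTS:
        return False
    path = parts.path or "/"
    # Exclude the listed path itself and anything beneath it.
    return not any(
        path == prefix or path.startswith(prefix + "/")
        for prefix in OUT_OF_SCOPE_PREFIXES
    )
```

A check like this is typically called before every automated request, so that a mistyped target fails closed rather than reaching a system nobody authorised for testing.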

What to provide your testing team:

A list of all in-scope URLs and API endpoints
User accounts with credentials at each role level required for the engagement
Any known sensitive areas to treat with extra care (payment processing, live customer data)
The technical contact available during testing for questions and coordination

Phase 2: Reconnaissance and Information Gathering

With scope defined, the tester systematically maps the application’s attack surface. This phase follows the WSTG’s information gathering test category (WSTG-INFO).

Passive reconnaissance collects publicly available information without interacting with the target directly: DNS records, WHOIS data, historical snapshots via the Wayback Machine, job listings (which reveal technology stacks), certificate transparency logs (which reveal subdomains), and public code repositories that may contain hardcoded secrets.

Active reconnaissance involves direct interaction with the target application: spider/crawling to discover all reachable URLs and parameters, HTTP fingerprinting to identify server technology and framework, JavaScript source analysis to find client-side logic and undocumented API endpoints, and content discovery to find hidden directories and forgotten administrative interfaces.
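The JavaScript source analysis step can be illustrated with a short sketch. It assumes, purely for illustration, that the application's API paths start with `/api/` and appear as quoted string literals; the pattern and the `extract_api_paths` helper are hypothetical, and real bundles usually need more flexible matching.

```python
import re

# Assumption: API routes look like "/api/..." inside single or double quotes.
ENDPOINT_PATTERN = re.compile(r"""["'](/api/[A-Za-z0-9_\-./{}]+)["']""")


def extract_api_paths(js_source: str) -> list[str]:
    """Return the unique API-looking paths found in a JavaScript source string."""
    return sorted(set(ENDPOINT_PATTERN.findall(js_source)))
```

Paths recovered this way are candidates only. Each one still has to be checked against the agreed scope and requested manually to confirm that it exists and what it does.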

Tools used in this phase:

Burp Suite Pro – the primary web application testing platform used by professional pentesters. Its crawler, scanner, and manual testing tools are the foundation of almost every engagement.
OWASP ZAP – the open-source alternative, useful for automated discovery
Amass, Subfinder – subdomain enumeration
Shodan, Censys – passive infrastructure intelligence
Google dorking – finding exposed files, login panels, and sensitive paths indexed by search engines

What the tester is building: A comprehensive map of the application – every endpoint, every parameter, every authentication flow, every API call, every third-party integration. This map is what the subsequent phases test methodically.

Phase 3: Vulnerability Analysis

With the attack surface mapped, the tester systematically analyses each component for potential vulnerabilities. This phase aligns to the WSTG’s core test categories.

Authentication testing (WSTG-ATHN): Testing of login mechanisms for brute force protection, username enumeration (does “invalid username” respond differently from “invalid password”?), password policy enforcement, MFA implementation and bypass, account lockout logic, and OAuth/SSO implementation flaws.

Session management testing (WSTG-SESS): Analysis of session token entropy and randomness, cookie security flags (HttpOnly, Secure, SameSite), session fixation and hijacking vulnerabilities, session timeout and logout behaviour, and cross-site request forgery (CSRF) protection.
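The cookie flag check described above is simple to express in code. This is a minimal sketch that inspects a single `Set-Cookie` header value; the `missing_cookie_flags` helper name is an assumption for illustration, and it only reports whether each attribute is present, not whether its value is appropriate.

```python
def missing_cookie_flags(set_cookie_header: str) -> list[str]:
    """Return the security attributes absent from one Set-Cookie header value."""
    # Attributes follow the first "name=value" pair, separated by semicolons.
    attributes = [part.strip().lower() for part in set_cookie_header.split(";")[1:]]
    names = {attribute.split("=", 1)[0] for attribute in attributes}
    return [
        flag
        for flag in ("HttpOnly", "Secure", "SameSite")
        if flag.lower() not in names
    ]
```

For a header such as `session=abc123; Path=/; HttpOnly`, the function reports `Secure` and `SameSite` as missing, which is the kind of observation that ends up as a session management finding.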

Authorisation testing (WSTG-ATHZ): Testing for Insecure Direct Object References (IDOR) – can one user access another user’s data by modifying an ID in the URL? Horizontal and vertical privilege escalation – can a standard user reach administrative functions? Path traversal – can the application be tricked into serving files outside its intended directory?
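To make the IDOR question concrete, here is a small sketch contrasting a handler that trusts the supplied ID with one that checks ownership. The in-memory `DOCUMENTS` store, the user names, and both function names are hypothetical; a real application would perform the same check against its own data layer.

```python
# Hypothetical in-memory data store; IDs and owners are made up for illustration.
DOCUMENTS = {
    101: {"owner": "alice", "body": "Alice's contract"},
    102: {"owner": "bob", "body": "Bob's invoice"},
}


def get_document_vulnerable(current_user: str, doc_id: int) -> str:
    """Returns any document by ID: the requester's identity is never checked."""
    return DOCUMENTS[doc_id]["body"]


def get_document_fixed(current_user: str, doc_id: int) -> str:
    """Returns the document only if it belongs to the requesting user."""
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc["owner"] != current_user:
        # Same error for "missing" and "not yours" avoids leaking which IDs exist.
        raise PermissionError("document not found or access denied")
    return doc["body"]
```

The test a pentester performs is the equivalent of calling the handler as one user with another user's ID. If the response contains the other user's data, the finding is confirmed.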

Input validation testing (WSTG-INPV): This is where injection vulnerabilities are tested. SQL injection, NoSQL injection, command injection, XML injection, LDAP injection, Server-Side Template Injection (SSTI), and HTML injection are all systematically tested across every input parameter in the application. Cross-Site Scripting (XSS) – both reflected and stored – is tested across all user-supplied input fields.

Business logic testing (WSTG-BUSL): This is what automated scanners cannot do. Business logic testing examines whether the application’s workflows can be abused: can the order of steps in a multi-step process be bypassed? Can negative quantities be submitted to a shopping cart? Can a discount code be applied multiple times? Can a rate limit be bypassed by splitting requests across concurrent threads? Business logic flaws are highly specific to each application and require a tester who understands the intended behaviour in order to probe for deviations from it.
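The negative quantity case can be sketched in a few lines. The function names and the upper limit of 100 items are assumptions chosen for illustration; the point is only that the business rule has to be enforced on the server, whatever the client-side form allows.

```python
from decimal import Decimal


def cart_total(unit_price: Decimal, quantity: int) -> Decimal:
    """Vulnerable: trusts the submitted quantity, so -3 yields a negative total."""
    return unit_price * quantity


def cart_total_validated(
    unit_price: Decimal, quantity: int, max_quantity: int = 100
) -> Decimal:
    """Enforces the quantity rule on the server, regardless of client-side checks."""
    if not 1 <= quantity <= max_quantity:
        raise ValueError("quantity out of range")
    return unit_price * quantity
```

With a unit price of 19.99, the first function returns a total of -59.97 for a quantity of -3, which in a real checkout could become a credit or a reduced order value. No scanner signature exists for that; the tester has to know that a negative total is not intended behaviour.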

API security testing: Modern web applications expose significant functionality through APIs. REST APIs, GraphQL endpoints, and SOAP services are tested against the OWASP API Security Top 10, with particular focus on Broken Object Level Authorization (BOLA/IDOR), Broken Authentication, Excessive Data Exposure, and Mass Assignment. The 2023 edition of that list merges the last two into a single category, Broken Object Property Level Authorization.

Configuration and infrastructure testing (WSTG-CONF): HTTP security header analysis (Content-Security-Policy, X-Frame-Options, HSTS, X-Content-Type-Options), TLS/SSL configuration and certificate validity, CORS policy configuration, server error message verbosity, and HTTP method permissiveness.
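The header analysis can be sketched as a simple presence check over a response's headers. The helper name is hypothetical and the header list mirrors the four named above (HSTS is sent as `Strict-Transport-Security`); presence alone does not mean a header is configured well, so each value still needs manual review.

```python
REQUIRED_HEADERS = (
    "Content-Security-Policy",
    "X-Frame-Options",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
)


def missing_security_headers(response_headers: dict[str, str]) -> list[str]:
    """Return the expected security headers that are absent from a response."""
    # HTTP header names are case-insensitive, so compare in lower case.
    present = {name.lower() for name in response_headers}
    return [header for header in REQUIRED_HEADERS if header.lower() not in present]
```

The input is assumed to be a plain dictionary of header names to values, as most HTTP client libraries can provide.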

Phase 4: Exploitation and Proof of Concept

Identifying a potential vulnerability is not the same as confirming it is exploitable. This phase distinguishes a professional pentest from an automated scan.

For each finding from Phase 3, the tester attempts to exploit the vulnerability – not to cause damage, but to demonstrate the real-world impact.

Vulnerability chaining is where the most valuable findings emerge. A finding that appears minor in isolation becomes critical when combined with others. An information disclosure that reveals an internal directory structure becomes significant when combined with a path traversal. An IDOR that exposes only non-sensitive IDs becomes critical if those IDs can be used to escalate access elsewhere.

A skilled tester asks for each finding: what can an attacker actually achieve with this? Can it be combined with other vulnerabilities for greater impact? What data, functionality, or systems could be compromised?

What is captured as evidence:

Screenshots and HTTP request/response pairs for each confirmed vulnerability
Proof-of-concept code or payload that demonstrates the exploit
A demonstration of the business impact – what data was accessible, what action could be performed

What professional testers do not do:

Exfiltrate real customer data (a PoC demonstrates access; it does not export data)
Disrupt production services (testing is conducted in a controlled manner)
Leave persistent backdoors or modifications in the application

Phase 5: Reporting

A pentest is only as valuable as its report. A finding that cannot be communicated clearly and acted upon has delivered no security value.

A professional web application pentest report contains two distinct sections, plus a letter of attestation where the report will be used for compliance:

Executive Summary: A non-technical overview written for business stakeholders – the CISO, board, or audit committee. It covers what was tested, when, the overall risk posture, the number and severity of confirmed findings, and the most important remediation priorities. An executive summary should fit on two pages and convey the core risk position without requiring technical knowledge to interpret.

Technical Report: The detailed findings document for the development and IT team. Each vulnerability is documented with:

Vulnerability title and CVSS 3.1 severity score (Critical / High / Medium / Low / Informational)
Description – what the vulnerability is and why it exists
Evidence – screenshots, HTTP requests/responses, and proof-of-concept demonstration
Business impact – what an attacker could achieve by exploiting this finding
Remediation guidance – specific, developer-ready instructions for fixing the vulnerability
References – OWASP WSTG test case ID, CWE reference, and any relevant CVEs

Compliance attestation: For engagements where the report will be used for ISO 27001, SOC 2, PCI DSS, or APRA CPS 234 compliance purposes, the report should include a formal letter of attestation confirming the methodology used, the scope tested, and the tester’s credentials. Auditors require this letter; a technical report alone is frequently insufficient.

CVSS 3.1 severity ratings are the standard expected by compliance auditors. Reports using proprietary severity scales without CVSS scores are often rejected by auditors.
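For readers who want to see where a CVSS 3.1 number comes from, the sketch below computes a base score from a vector string. It uses the metric weights and rounding rule published in the CVSS v3.1 specification, but it is deliberately limited to vectors with Scope: Unchanged and ignores temporal and environmental metrics. The function names are illustrative, and an official calculator should be used for reported scores.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, using the integer method from the spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Base score for a vector such as 'AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H'."""
    metrics = dict(item.split(":") for item in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    impact_subscore = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * impact_subscore
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

As an example, an unauthenticated, network-reachable flaw with high impact on confidentiality, integrity, and availability (`AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H`) scores 9.8, which is why so many unauthenticated injection findings carry that number.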

Phase 6: Remediation and Retesting

The pentest report is not the endpoint – it is the starting point for remediation.

The development and IT team address findings in priority order. Once remediation is complete, a focused retest validates that each fix was implemented correctly and that the fix did not introduce new vulnerabilities.

What a retest covers:

Each finding from the original report, tested against the same PoC
Confirmation that the root cause was addressed (not just the specific instance)
Any adjacent areas that the remediation may have affected
An updated report reflecting the remediation status of each finding

Most professional engagements include a free retest within 60 days. Ensure this is confirmed before you sign the engagement – a retest charge on top of the original assessment cost is common with lower-quality providers.

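OWASP Top 10 – What Is Being Tested

The OWASP Top 10 is the authoritative list of the most critical web application security risks, and every professional web app pentest must cover it in full. The ten categories below use the names and numbering of the 2021 edition. The 2025 edition reorders and renames several of them: Security Misconfiguration moves up to A02, Vulnerable and Outdated Components is broadened into Software Supply Chain Failures (A03), Server-Side Request Forgery is folded into Broken Access Control, and Mishandling of Exceptional Conditions is added as A10. The underlying risks being tested are the same. Here is a plain-English summary of what each represents.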

A01: Broken Access Control

The most prevalent vulnerability in the 2021 list and still critical in 2025. Users can access data or functionality they should not be permitted to reach – another user’s account data, administrative functions, or restricted content. The OWASP data from 2.8 million applications shows access control failures in a significant proportion of tested applications.

What testers check: IDOR flaws, horizontal and vertical privilege escalation, missing function-level access controls, insecure direct URL access.

A02: Cryptographic Failures

Sensitive data – passwords, credit card numbers, health records, personally identifiable information – is inadequately protected in transit or at rest. Includes weak encryption algorithms, missing HTTPS enforcement, and improperly managed cryptographic keys.

What testers check: TLS configuration, password hashing algorithms, sensitive data in HTTP responses, cleartext transmission of sensitive fields.

A03: Injection

Injection attacks occur when untrusted data is sent to an interpreter – SQL injection, command injection, LDAP injection, Server-Side Template Injection. Injection vulnerabilities remain among the most damaging vulnerability types, enabling attackers to read or modify databases, execute system commands, or access internal infrastructure.

What testers check: All user-supplied input fields, URL parameters, HTTP headers, and API parameters tested for injection vulnerabilities using both manual testing and targeted tooling (SQLMap, manual payloads).

A04: Insecure Design

A 2021 addition addressing architectural security flaws – not just implementation bugs, but fundamentally insecure design decisions. Missing rate limiting on authentication endpoints, no account lockout, business logic that trusts client-side data.

What testers check: Rate limiting on sensitive endpoints, trust assumptions in application logic, client-side validation relied upon without server-side verification.

A05: Security Misconfiguration

Default credentials left active, verbose error messages exposing internal paths, unnecessary features enabled, cloud storage buckets publicly accessible, CORS policies too permissive. Security misconfiguration is consistently found in a majority of tested applications.

What testers check: HTTP security headers, error page responses, default paths, CORS configuration, TLS settings, unnecessary HTTP methods enabled.

A06: Vulnerable and Outdated Components

Using libraries, frameworks, or software components with known vulnerabilities. The Log4Shell vulnerability (CVE-2021-44228) is the most high-profile recent example – a critical vulnerability in a widely used logging library that affected millions of applications.

What testers check: Version fingerprinting of identified components, known CVE matching, outdated JavaScript libraries in the front end.

A07: Identification and Authentication Failures

Weak authentication implementations – missing MFA, predictable session tokens, insecure password reset flows, credential stuffing vulnerabilities, session tokens exposed in URLs.

What testers check: Session token randomness, password policy enforcement, MFA implementation and bypass, OAuth flow security, account enumeration.

A08: Software and Data Integrity Failures

Insecure deserialisation and CI/CD pipeline integrity. Applications that deserialise untrusted data without validation can be compromised through crafted payloads that execute arbitrary code.

What testers check: Deserialisation endpoints, auto-update mechanisms that do not verify integrity, software supply chain exposures.

A09: Security Logging and Monitoring Failures

Insufficient logging that prevents detection of and response to attacks. Applications that do not log authentication failures, access control violations, or suspicious input give attackers the ability to operate undetected for extended periods.

What testers check: Whether test activity generates alerts, whether login failures are logged, whether the application logs IP addresses and user agents.

A10: Server-Side Request Forgery (SSRF)

The application fetches external resources based on user-supplied URLs without validating them. An attacker can abuse this to make the server request internal resources – accessing AWS metadata endpoints, internal APIs, or internal network services that are not exposed externally.

What testers check: Any URL or endpoint parameter that the server uses to fetch resources, redirect functionality, PDF/image generation services.
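A minimal sketch of the validation that SSRF-prone code is usually missing follows. It flags URLs whose host is an IP literal in a loopback, private, link-local, or reserved range, which includes the cloud metadata address 169.254.169.254. The function name is hypothetical, and the sketch does not resolve hostnames, so a complete defence also has to check the address a name resolves to at request time.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_address(url: str) -> bool:
    """Return True if the URL's host is an IP literal in a non-public range.

    Hostnames are not resolved here; a real check must also validate the
    address the name resolves to when the request is made.
    """
    host = urlsplit(url).hostname or ""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; DNS-based checks are out of scope here
    return (
        address.is_private
        or address.is_loopback
        or address.is_link_local
        or address.is_reserved
    )
```

During a test, the same list of internal addresses is what the tester submits to any URL-fetching parameter to see whether the server will make the request on their behalf.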

Manual Testing vs Automated Scanning: Understanding the Difference

This distinction matters for how you evaluate a testing provider.

	Manual Penetration Test	Automated Vulnerability Scan
Business logic flaws	Tested – requires human understanding	Not tested – no pattern to match
Vulnerability chaining	Yes – tester combines findings for impact	No – findings reported in isolation
False positive rate	Near zero – every finding manually confirmed	High – many findings require manual triage
Authenticated testing	Yes – full coverage of logged-in functionality	Partial – scanners often fail on complex auth
IDOR / access control	Yes – tested with multiple role accounts	Limited – requires context a scanner lacks
Audit acceptance	Yes – CREST/OSCP certified reports accepted	Often not accepted alone by auditors
New/novel vulnerabilities	Yes – tester can find zero-day patterns	No – relies on known signatures
Cost	Higher	Lower
Best for	Compliance, real security assurance	Continuous monitoring, baseline coverage

The correct answer is both. Automated scanning is appropriate for continuous monitoring and rapid feedback during development. Manual penetration testing is required for compliance, pre-launch assurance, and genuine security validation. They are complementary, not competing.

What the Testing Tools Actually Do

Understanding the tools helps you evaluate what your testing provider is actually doing – and spot engagements that are automated scans presented as manual pentests.

Burp Suite Pro (PortSwigger) is the primary tool used by professional web application pentesters. It intercepts and modifies HTTP traffic, provides an active scanner, and enables manual testing through its repeater, intruder, and sequencer tools. A genuinely manual pentest involves significant time in Burp Suite’s repeater – manually crafting and modifying requests – not just running the automated scanner.

OWASP ZAP is the open-source alternative to Burp Suite, widely used for automated scanning and as a complement to manual testing.

SQLMap is an automated SQL injection detection and exploitation tool. It is used to demonstrate that a SQL injection finding is genuinely exploitable, for example by enumerating the schema or reading rows from a test database, rather than being a theoretical match.

Nikto is a web server scanner that identifies outdated software versions, security misconfigurations, and common vulnerability patterns at the server level.

Amass / Subfinder are subdomain enumeration tools used in the reconnaissance phase to discover all subdomains associated with the target organisation.

Custom scripts and manual payloads – every experienced tester writes custom payloads for the specific application being tested. Business logic testing in particular requires understanding the application deeply enough to write payloads that abuse its specific logic. This cannot be automated.

When Should You Commission a Web Application Pentest?

These are the six triggers that indicate a web application pentest is appropriate.

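Annual compliance schedule. PCI DSS v4.0 (Requirement 11.4) explicitly requires penetration testing of in-scope systems at least annually. ISO/IEC 27001 (controls A.8.8 and A.8.29) and SOC 2 Type II (CC7.1) do not prescribe a fixed frequency, but auditors commonly expect an annual penetration test as evidence for those controls. In each case a scan report alone is rarely sufficient; auditors typically look for a report from an independent, accredited tester, such as a CREST-accredited provider.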

Pre-launch or major release. Before a new application or a significant new feature goes live – especially any feature involving authentication changes, new payment integration, new user roles, or significant data handling changes. Remediating a vulnerability pre-production is dramatically cheaper than post-breach.

After a security incident or near-miss. A confirmed breach or a suspicious access event indicates a vulnerability was exploited. A post-incident pentest validates whether the root cause is fixed and identifies adjacent weaknesses that the attacker may have found but not yet exploited.

Vendor security questionnaires. Enterprise customers and government agencies increasingly require evidence of independent penetration testing – specifically a CREST-accredited report – as a condition of procurement.

After significant infrastructure changes. New cloud provider, new authentication system, new API, new third-party integration. PCI DSS Requirement 11.4 explicitly mandates testing after any significant change.

Pre-acquisition or due diligence. Cybersecurity due diligence on a target company or SaaS platform being acquired or integrated. A web application pentest is a standard component of technical due diligence.

Compliance Mapping for Australian Businesses

A professional web application penetration test report supports compliance against the following frameworks:

ACSC Essential Eight. Web application penetration testing provides direct evidence for the Application Hardening and Patch Applications mitigation strategies. The Essential Eight Maturity Model at Level 2 and above requires regular testing of internet-facing applications. See our Essential Eight checklist for the full control mapping.

Privacy Act 1988 / APP 11. The Australian Privacy Principles require organisations to take reasonable steps to protect personal information held in web applications. A regular penetration testing programme is the clearest demonstration of these steps being taken. Following a notifiable data breach, the OAIC investigates whether reasonable technical measures were in place – and a documented testing programme is a central part of that evidence. For more context on how testing relates to your overall security posture, see our cyber risk management guidance.

ISO/IEC 27001:2022. Controls A.8.8 (management of technical vulnerabilities) and A.8.29 (security testing in development and acceptance) explicitly require technical vulnerability assessment. A.5.23 covers security for the use of cloud services. Annual web application penetration testing with a formal report provides audit evidence for each of these controls.

SOC 2 Type II. Trust Services Criterion CC7.1 (security monitoring) requires evidence of vulnerability identification and remediation. A CVSS-scored pentest report with documented remediation is the standard evidence form for this criterion.

PCI DSS v4.0. Requirement 11.4 mandates annual web application penetration testing for any application involved in cardholder data processing, storage, or transmission – and after any significant change to the environment.

APRA CPS 234. Australian financial services entities regulated by APRA must regularly test the effectiveness of information security controls. Web application penetration testing is a direct response to this obligation for any financial institution with a customer-facing web application.

How to Read Your Pentest Report

Most businesses receive a pentest report and do not know how to prioritise the findings or what to do next.

Start with the CVSS scores. Critical (9.0–10.0) and High (7.0–8.9) findings need immediate attention regardless of other priorities. These are exploitable vulnerabilities with significant potential impact. Remediate all Critical and High findings before the retest.
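The score bands are easy to apply mechanically when triaging a long findings list. The sketch below maps a CVSS 3.1 score to its qualitative rating using the bands from the CVSS specification; the `severity` function name is an assumption for illustration.

```python
def severity(score: float) -> str:
    """Map a CVSS 3.1 score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

Sorting findings by score in descending order and grouping them by this rating gives a first-pass remediation queue, which the business impact of each finding should then refine.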

Read the business impact section for each finding. The CVSS score tells you technical severity. The business impact section tells you what it means in operational terms – “an attacker can access all customer records” is more actionable than “CVSS 9.8 Critical.” Match the business impact to your actual data and user base to prioritise effectively.

Do not dismiss Informational and Low findings. An Informational finding on its own may be harmless. Combined with a Medium finding elsewhere in the application, it may enable a meaningful attack chain. Review all findings with your tester’s context, not just the score.

Assign remediation ownership immediately. Each finding should be assigned to a developer or team with a target remediation date. Unassigned findings drift. Use your issue tracker (Jira, GitHub Issues, Azure DevOps) to create a ticket for each finding, linked to the report.

Verify root cause remediation, not just instance remediation. If a SQL injection is found in one input field, the fix is not just parameterised queries for that specific input – it is parameterised queries everywhere in the codebase. Testers frequently find the same vulnerability type in multiple places because the root cause (a shared library, a development pattern, a framework configuration) was not addressed.
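The difference between the vulnerable pattern and the root-cause fix can be shown with Python's built-in `sqlite3` module. The table, its rows, and the function names below are invented for illustration; the same contrast applies to any database driver that supports bound parameters.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER, owner TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?)",
        [(1, "alice", "Contract"), (2, "bob", "Invoice")],
    )
    return conn


def search_vulnerable(conn: sqlite3.Connection, owner: str, term: str) -> list[str]:
    # User input is concatenated into the SQL text, so it can alter the query.
    sql = (
        f"SELECT title FROM documents WHERE owner = '{owner}' "
        f"AND title LIKE '%{term}%'"
    )
    return [row[0] for row in conn.execute(sql)]


def search_parameterised(conn: sqlite3.Connection, owner: str, term: str) -> list[str]:
    # Input is passed as bound parameters and is only ever treated as data.
    sql = "SELECT title FROM documents WHERE owner = ? AND title LIKE ?"
    return [row[0] for row in conn.execute(sql, (owner, f"%{term}%"))]
```

Searching as `alice` with the term `' OR 1=1 --` makes the vulnerable function return every row in the table, including Bob's, while the parameterised version returns nothing because no title contains that literal text. Applying the second pattern to one endpoint fixes one instance; applying it in the shared data access code fixes the root cause.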

Confirm your retest is scheduled. The retest validates that fixes work. Do not close your findings tracking until the retest report confirms remediation. A “fixed” finding that was not retested is an assumption, not a confirmation.

A Real-World Example: What a Web App Pentest Finds

To illustrate what this methodology produces in practice, here is an anonymised summary of findings from a grey-box web application penetration test conducted for an Australian professional services client.

The application was a customer portal handling sensitive client documents and communication history.

Critical – SQL Injection in search function: The application’s document search parameter was not parameterised. Testing with a standard SQLMap payload confirmed database read access. The tester was able to extract the full user table, including hashed passwords and email addresses, from a read-only test database replica – without any authentication. CVSS 9.8.

High – IDOR in document download endpoint: The document download URL used a sequential numeric document ID. By iterating the ID parameter while authenticated as a low-privilege user, the tester could download documents belonging to any other client. 3,400 client documents were accessible to any authenticated user. CVSS 8.1.

High – No rate limiting on MFA code verification: The application’s MFA implementation used a 6-digit TOTP code but had no rate limiting on code attempts. With only 1,000,000 possible codes, a brute force attack was feasible within the TOTP window. CVSS 7.5.
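The arithmetic behind this finding can be sketched as follows. The model assumes the code changes at every time step and that each step is an independent chance to guess it; the guess rates used are hypothetical and were not measured in this engagement. Servers that also accept codes from adjacent steps would make the attacker's odds better than this estimate.

```python
def success_probability(
    guesses_per_step: int, steps: int, code_space: int = 10**6
) -> float:
    """Chance of guessing at least one valid code across a number of time steps."""
    # Fraction of the code space covered during a single step, capped at 100%.
    per_step = min(guesses_per_step / code_space, 1.0)
    return 1 - (1 - per_step) ** steps
```

Under these assumptions, an attacker who can cover the whole code space in one step succeeds with certainty, and even one managing only 1,000 guesses per 30-second step has a better than 90% chance of success over 24 hours (2,880 steps). A lockout or rate limit after a handful of failed attempts removes both scenarios.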

Medium – Verbose error messages exposing stack traces: Submitting malformed input to the file upload endpoint returned a full .NET stack trace including internal server paths and framework version. This provided reconnaissance value to an attacker. CVSS 5.3.

Low – Missing security headers: Content-Security-Policy and X-Frame-Options headers were absent, leaving the application exposed to clickjacking and certain cross-site scripting injection vectors. CVSS 3.7.

The Critical and High findings were remediated within the 30-day remediation window. A retest confirmed that all three Critical and High findings were addressed. The compliance report was accepted by the client’s ISO 27001 auditor at their next scheduled review.

This is what a grey-box engagement against a real application produces. The SQL injection alone, in a customer portal handling sensitive client documents, was a breach waiting to happen.
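The difference between a concatenated query and a parameterised one can be shown in a few lines. This sketch uses Python's built-in `sqlite3` module with an in-memory database. The table, the data, and the function names are assumptions for illustration and do not reflect the client's stack.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO documents (title) VALUES (?)",
    [("Engagement letter",), ("Invoice March",)],
)


def search_unsafe(term: str) -> list[tuple]:
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT id, title FROM documents WHERE title LIKE '%" + term + "%'"
    return conn.execute(query).fetchall()


def search_safe(term: str) -> list[tuple]:
    # Parameterised: the value is bound separately from the SQL text.
    query = "SELECT id, title FROM documents WHERE title LIKE ?"
    return conn.execute(query, (f"%{term}%",)).fetchall()


# Classic injection string: closes the quote, adds a tautology,
# and comments out the remainder of the statement.
payload = "x' OR 1=1 --"
print(len(search_unsafe(payload)))  # every row is returned
print(len(search_safe(payload)))    # payload is treated as plain text
```

In the unsafe version the payload rewrites the WHERE clause. In the safe version it is just an unusual search term that matches nothing.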

If your web application handles customer data, processes payments, or is subject to any compliance obligation, contact our team to discuss a penetration testing engagement.

Frequently Asked Questions

What is web application penetration testing methodology?

Web application penetration testing methodology is the structured, phase-by-phase process a security tester follows to systematically identify and validate exploitable vulnerabilities in a web application. The standard methodology is defined by the OWASP Web Security Testing Guide (WSTG), which covers information gathering, configuration testing, authentication testing, session management, authorisation, input validation, business logic, and API security. A methodology-aligned test ensures consistent, repeatable, and audit-accepted results – as opposed to an ad hoc scan.

What is the difference between black-box, grey-box, and white-box web application penetration testing?

Black-box testing gives the tester only the application URL – simulating a completely external attacker. Grey-box testing provides the tester with user credentials at each role level, enabling testing of authenticated functionality. White-box testing provides credentials, source code, and architecture documentation for maximum coverage. For most commercial engagements, grey-box is the recommended approach – it balances realistic simulation with comprehensive coverage of the vulnerabilities that cause real-world breaches.

What does OWASP Top 10:2025 cover?

The OWASP Top 10:2025 lists the ten most critical web application security risks: Broken Access Control (A01), Cryptographic Failures (A02), Injection (A03), Insecure Design (A04), Security Misconfiguration (A05), Vulnerable and Outdated Components (A06), Identification and Authentication Failures (A07), Software and Data Integrity Failures (A08), Security Logging and Monitoring Failures (A09), and Server-Side Request Forgery / SSRF (A10). The 2025 edition was updated based on data from 2.8 million applications and reflects current threat patterns. Every professional web application penetration test should cover all ten categories.

How long does a web application penetration test take?

A typical grey-box web application penetration test for a moderately complex application takes 5–10 business days from start to report delivery. Simple single-function applications may take 3–5 days. Complex platforms with multiple user roles, extensive API surface, and significant business logic may require 2–3 weeks. The scoping phase determines the timeline. Build in additional time for the remediation window (typically 15–30 days) and the retest (1–2 days), so a full engagement from kick-off to final report is typically 4–8 weeks.

What credentials do I need to provide for a grey-box test?

For a grey-box engagement, provide a working account at each user role your application uses – standard user, premium user, admin, API consumer, support agent, or any other distinct role with different access permissions. Each account should have access to representative test data so the tester can meaningfully explore authenticated functionality. Test accounts (not real customer accounts) should be used, in a test or staging environment where possible. Where production testing is required, the scope and approach must be agreed carefully in the scoping phase.

What is CVSS and why does it appear in pentest reports?

CVSS (Common Vulnerability Scoring System) is the standardised framework for rating the severity of security vulnerabilities. Version 3.1 remains the version most commonly used in pentest reports, although version 4.0 was published in November 2023. Scores range from 0 to 10 and are categorised as Critical (9.0–10.0), High (7.0–8.9), Medium (4.0–6.9), Low (0.1–3.9), or Informational (0.0, which the CVSS specification itself labels None). CVSS scores appear in pentest reports because compliance auditors – for ISO 27001, SOC 2, PCI DSS, and APRA – require standardised severity evidence. A report using proprietary severity scales without CVSS scores is frequently rejected or queried by auditors.
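The score bands above map to severity labels mechanically. The short sketch below shows that mapping and applies it to the five scores from the example findings earlier in this guide. The function name is hypothetical.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS base score (0.0 to 10.0) to its severity band."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        # The CVSS specification calls this band "None".
        return "Informational"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"


# Scores from the example findings in this guide.
for score in (9.8, 8.1, 7.5, 5.3, 3.7):
    print(score, cvss_severity(score))
```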

Why is manual testing better than running an automated scanner?

Automated scanners match known patterns against known signatures. They miss business logic vulnerabilities (there is no pattern to match for “this checkout workflow can be abused to get products for free”), they cannot test accurately in complex authenticated scenarios, they produce significant false positives that require manual triage, and their reports are typically not accepted alone by compliance auditors. Manual testing by a qualified penetration tester finds vulnerabilities that automated tools miss, confirms every finding is genuinely exploitable, chains findings for real-world impact, and produces an audit-accepted report. For genuine security assurance and compliance evidence, manual penetration testing is required.

How often should web application penetration testing be done?

ISO 27001, SOC 2, and PCI DSS all expect at least annual penetration testing of in-scope web applications. PCI DSS additionally requires testing after any significant change to the application or its environment. For Australian businesses under APRA CPS 234, regular testing of information security controls is an explicit obligation. Beyond compliance schedules, best practice is to test before major releases, after significant architectural changes, and following any security incident or near-miss. Continuous automated scanning between annual penetration tests provides an additional layer of coverage.

What should be in a professional web application penetration test report?

A professional report contains two sections. The executive summary covers what was tested, the overall risk posture, and the most important remediation priorities – written for non-technical stakeholders. The technical report documents every finding with a CVSS 3.1 severity score, description, HTTP evidence and proof-of-concept, business impact statement, and specific developer-ready remediation guidance. For compliance use, the report includes a formal attestation letter confirming the testing methodology, scope, and tester qualifications (CREST/OSCP). Reports from automated scanners, or reports lacking CVSS scoring and attestation, are frequently rejected by compliance auditors.
