Skip to content

Adding JSON tag in JSON in a text #3452

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

animehart
Copy link

@animehart animehart commented Jul 14, 2025

Summary of your changes

This PR addresses the issue where JSON text is not being tagged properly

Notes:

  • The real "issue" is in the excel file that we extract the rules from

Copy link

mergify bot commented Jul 14, 2025

This pull request does not have a backport label. Could you fix it @animehart? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@uri-weisman
Copy link
Contributor

@animehart thanks for contributing this code, can you please run the script and we'll see the change to the existing metadata files?

@animehart animehart marked this pull request as ready for review July 22, 2025 16:56
@animehart animehart requested a review from a team as a code owner July 22, 2025 16:56
@animehart animehart requested a review from uri-weisman July 23, 2025 15:29
@uri-weisman uri-weisman requested a review from Copilot July 25, 2025 14:32
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves JSON formatting in security policy documentation by adding proper JSON language tags to code blocks and implementing enhanced JSON formatting functionality. The changes ensure that JSON content within markdown code blocks is properly tagged and formatted for better readability and syntax highlighting.

  • Enhanced the fix_code_blocks function to accept rule number and benchmark ID parameters for conditional JSON formatting
  • Added comprehensive JSON formatting functions to handle various JSON scenarios including broken blocks and malformed content
  • Updated YAML data files to add proper json language tags to code blocks containing JSON content

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.

File Description
security-policies/dev/generate_rule_metadata.py Modified function calls to pass rule number and benchmark ID to fix_code_blocks
security-policies/dev/common.py Added new JSON formatting functions and enhanced fix_code_blocks with conditional logic
Multiple YAML files Added json language tags to code blocks and fixed JSON formatting issues

Returns:
str: Updated text with fixed JSON formatting.
"""
pattern = re.compile(r"```(?:\s*)'?\{.*?```", re.DOTALL)
Copy link
Preview

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern may match incomplete JSON blocks. The pattern `{.*?```` uses non-greedy matching which could stop at the first occurrence of backticks, potentially matching malformed content that doesn't represent a complete JSON block.

Suggested change
pattern = re.compile(r"```(?:\s*)'?\{.*?```", re.DOTALL)
pattern = re.compile(r"```(?:\s*)'?\{(?:[^{}]|(?R))*\}```", re.DOTALL)

Copilot uses AI. Check for mistakes.

Comment on lines +193 to +198
# Count braces
open_braces = json_content.count("{")
close_braces = json_content.count("}")
missing = open_braces - close_braces
if missing > 0:
json_content += "}" * missing
Copy link
Preview

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Counting braces with simple string methods doesn't account for braces inside string literals. A JSON object like {"key": "value with { brace"} would have the brace count logic incorrectly identify missing braces.

Suggested change
# Count braces
open_braces = json_content.count("{")
close_braces = json_content.count("}")
missing = open_braces - close_braces
if missing > 0:
json_content += "}" * missing
# Attempt to parse JSON content to ensure validity
try:
parsed_json = json.loads(json_content)
except json.JSONDecodeError:
# Fallback: Count braces while ignoring braces inside string literals
open_braces = sum(1 for char in json_content if char == "{" and not re.search(r'".*?"', json_content[:json_content.index(char)]))
close_braces = sum(1 for char in json_content if char == "}" and not re.search(r'".*?"', json_content[:json_content.index(char)]))
missing = open_braces - close_braces
if missing > 0:
json_content += "}" * missing

Copilot uses AI. Check for mistakes.

try:
# Try to clean up invalid JSON-like text
fixed = json_candidate.replace("'", '"')
fixed = re.sub(r"(\w+):", r'"\1":', fixed) # unquoted keys
Copy link
Preview

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regex will incorrectly quote keys that are already inside string values. For example, {"description": "Property: value"} would become {"description": ""Property": value"} which is invalid JSON.

Suggested change
fixed = re.sub(r"(\w+):", r'"\1":', fixed) # unquoted keys
fixed = re.sub(r'(?<!")(\b\w+\b):', r'"\1":', fixed) # unquoted keys, avoid keys inside string values

Copilot uses AI. Check for mistakes.

parsed_json = json.loads(stripped_block)
formatted_json = json.dumps(parsed_json, indent=4)
# Replace original block with formatted block, tagged as JSON
text = text.replace(f"```{block}```", f"```json\n{formatted_json}\n```")
Copy link
Preview

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using string replacement could match and replace unintended occurrences if the same JSON block appears multiple times in the text. This could lead to incorrect replacements or partial matches.

Suggested change
text = text.replace(f"```{block}```", f"```json\n{formatted_json}\n```")
pattern = re.escape(f"```{block}```")
replacement = f"```json\n{formatted_json}\n```"
text = re.sub(pattern, replacement, text, count=1)

Copilot uses AI. Check for mistakes.

}
```

Verify the `ServerCertificateName` and `Expiration` parameter value (expiration date) for each SSL/TLS certificate returned by the list-server-certificates command and determine if there are any expired server certificates currently stored in AWS IAM.
If so, use the AWS API to remove them.

If this command returns:
```
```json
{ { "ServerCertificateMetadataList": [] }
Copy link
Preview

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON syntax has double opening braces { { which is invalid JSON. This should be a single opening brace {.

Suggested change
{ { "ServerCertificateMetadataList": [] }
{ "ServerCertificateMetadataList": [] }

Copilot uses AI. Check for mistakes.

Comment on lines +326 to 334
if rule_number in {"1.17", "3.5"} and benchmark_id == "cis_aws":
text = format_json_in_text(text)
elif (rule_number in {"3.10", "3.11"} and benchmark_id == "cis_aws") or (
rule_number in {"5.1.5"} and benchmark_id == "cis_azure"
):
text = format_json_in_string_command(text)
else:
text = format_and_fix_json_in_text(text)
return check_and_fix_numbered_list(text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d prefer that we don’t apply a function tailored to a specific rule. Instead, we should implement a fallback mechanism, if the output doesn’t match the expected format or an error was thrown - we trigger the next JSON formatting function in line.

@@ -161,6 +161,76 @@ def check_and_fix_numbered_list(text):
return "\n".join(lines)


def fix_broken_json_block(text: str) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider to divide to small functions to increase readability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] [Cloud Security] Popover rememdiation does not indent json properly
2 participants